DSpace at KOASAS: HARP: Autoregressive Latent Video Prediction with High-Fidelity Image Generator


HARP: Autoregressive Latent Video Prediction with High-Fidelity Image Generator


DC Field	Value	Language
dc.contributor.author	Seo, Younggyo	ko
dc.contributor.author	Lee, Kimin	ko
dc.contributor.author	Liu, Fangchen	ko
dc.contributor.author	James, Stephen	ko
dc.contributor.author	Abbeel, Pieter	ko
dc.date.accessioned	2023-12-12T10:01:14Z	-
dc.date.available	2023-12-12T10:01:14Z	-
dc.date.created	2023-12-08	-
dc.date.issued	2022-10	-
dc.identifier.citation	29th IEEE International Conference on Image Processing, ICIP 2022, pp.3943 - 3947	-
dc.identifier.issn	1522-4880	-
dc.identifier.uri	http://hdl.handle.net/10203/316315	-
dc.description.abstract	Video prediction is an important yet challenging problem, burdened with the tasks of generating future frames and learning environment dynamics. Recently, autoregressive latent video models have proved to be a powerful video prediction tool by separating video prediction into two sub-problems: pre-training an image generator model, followed by learning an autoregressive prediction model in the latent space of the image generator. However, successfully generating high-fidelity and high-resolution videos has yet to be seen. In this work, we investigate how to train an autoregressive latent video prediction model capable of predicting high-fidelity future frames with minimal modification to existing models, and produce high-resolution (256x256) videos. Specifically, we scale up prior models by employing a high-fidelity image generator (VQ-GAN) with a causal transformer model, and introduce additional techniques of top-k sampling and data augmentation to further improve video prediction quality. Despite its simplicity, the proposed method achieves performance competitive with state-of-the-art approaches on standard video prediction benchmarks with fewer parameters, and enables high-resolution video prediction on complex and large-scale datasets.	-
dc.language	English	-
dc.publisher	IEEE International Conference on Image Processing	-
dc.title	HARP: Autoregressive Latent Video Prediction with High-Fidelity Image Generator	-
dc.type	Conference	-
dc.identifier.wosid	001058109504005	-
dc.identifier.scopusid	2-s2.0-85139902097	-
dc.type.rims	CONF	-
dc.citation.beginningpage	3943	-
dc.citation.endingpage	3947	-
dc.citation.publicationname	29th IEEE International Conference on Image Processing, ICIP 2022	-
dc.identifier.conferencecountry	FR	-
dc.identifier.conferencelocation	Bordeaux	-
dc.identifier.doi	10.1109/ICIP46576.2022.9897982	-
dc.contributor.localauthor	Lee, Kimin	-
dc.contributor.nonIdAuthor	Liu, Fangchen	-
dc.contributor.nonIdAuthor	James, Stephen	-
dc.contributor.nonIdAuthor	Abbeel, Pieter	-

Appears in Collection: AI-Conference Papers(학술대회논문)

Files in This Item: There are no files associated with this item.
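
Note on the abstract: it names top-k sampling as one of the techniques applied when a causal transformer predicts discrete latent codes. The following is a minimal sketch of generic top-k sampling, not the authors' implementation. The function names, the `next_logits` callback (a hypothetical stand-in for the transformer), and the toy logits are assumptions made for illustration only.

```python
import math
import random


def top_k_sample(logits, k, rng=random):
    """Sample an index from the k highest-scoring entries of `logits`."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # Keep the indices of the k largest logits; every other entry is excluded.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the kept logits, shifted by their max for numerical stability.
    peak = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - peak) for i in top]
    # Draw one of the kept indices in proportion to its weight.
    return rng.choices(top, weights=weights, k=1)[0]


def sample_latent_codes(next_logits, num_tokens, k, rng=random):
    """Autoregressively draw `num_tokens` discrete codes with top-k sampling.

    `next_logits(prefix)` is a hypothetical callback standing in for a causal
    transformer: it maps the codes generated so far to logits over a codebook.
    """
    codes = []
    for _ in range(num_tokens):
        codes.append(top_k_sample(next_logits(codes), k, rng))
    return codes
```

With `k=1` this reduces to greedy decoding; larger `k` admits more of the codebook while still excluding low-probability codes.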

This item is cited by 5 documents in WoS.


KOASAS

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
