Towards end-to-end generative modeling of long videos with memory-efficient bidirectional transformers

Autoregressive transformers have shown remarkable success in video generation. However, the quadratic complexity of self-attention prevents them from directly learning long-term dependencies in videos, and the autoregressive process makes them inherently suffer from slow inference and error propagation. In this paper, we propose the Memory-efficient Bidirectional Transformer (MeBT) for end-to-end learning of long-term dependencies in videos and fast inference. Building on recent advances in bidirectional transformers, our method learns to decode the entire spatio-temporal volume of a video in parallel from partially observed patches. The proposed transformer achieves linear time complexity in both encoding and decoding by projecting the observable context tokens into a fixed number of latent tokens and conditioning on them to decode the masked tokens through cross-attention. Empowered by linear complexity and bidirectional modeling, our method demonstrates significant improvements over autoregressive transformers in both quality and speed when generating moderately long videos.
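The abstract describes compressing a variable number of context tokens into a fixed number of latent tokens, then decoding all masked tokens in parallel by cross-attending to those latents. The following is a minimal NumPy sketch of that latent-bottleneck idea, not the thesis's actual architecture: all array sizes and the single-head, unprojected attention are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    # Scaled dot-product cross-attention: cost is O(len(queries) * len(keys)).
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ values

rng = np.random.default_rng(0)
d_model, n_latent = 16, 8          # latent bottleneck size is FIXED
n_context, n_masked = 1000, 1000   # these grow with video length

context = rng.normal(size=(n_context, d_model))  # observed patch tokens
masked = rng.normal(size=(n_masked, d_model))    # masked-token queries
latents = rng.normal(size=(n_latent, d_model))   # learned latent tokens

# Encode: compress the variable-length context into n_latent tokens,
# costing O(n_context * n_latent) rather than O(n_context^2) self-attention.
z = cross_attention(latents, context, context)

# Decode: every masked token reads from the latent bottleneck in parallel,
# costing O(n_masked * n_latent); total cost is linear in video length.
out = cross_attention(masked, z, z)
print(out.shape)  # (1000, 16)
```

Because the bottleneck size stays constant while the token counts grow, both stages scale linearly with the number of video tokens, which is the memory-efficiency property the abstract claims.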
Advisors
홍승훈
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2023
Identifier
325007
Language
eng
Description

Master's thesis - Korea Advanced Institute of Science and Technology: School of Computing, 2023.8, [iv, 26 p.]

Keywords

Generative modeling of videos; bidirectional transformer; memory efficiency; latent bottleneck

URI
http://hdl.handle.net/10203/320727
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1045959&flag=dissertation
Appears in Collection
CS-Theses_Master (Master's theses)
