DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 홍승훈 | - |
dc.contributor.author | Yoo, Jaehoon | - |
dc.contributor.author | 유재훈 | - |
dc.date.accessioned | 2024-07-25T19:31:25Z | - |
dc.date.available | 2024-07-25T19:31:25Z | - |
dc.date.issued | 2023 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1045959&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/320727 | - |
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology : School of Computing, 2023.8, [iv, 26 p.] | - |
dc.description.abstract | Autoregressive transformers have shown remarkable success in video generation. However, these transformers are prevented from directly learning long-term dependencies in videos by the quadratic complexity of self-attention, and they inherently suffer from slow inference and error propagation due to the autoregressive process. In this paper, we propose the Memory-efficient Bidirectional Transformer (MeBT) for end-to-end learning of long-term dependencies in videos and fast inference. Building on recent advances in bidirectional transformers, our method learns to decode the entire spatio-temporal volume of a video in parallel from partially observed patches. The proposed transformer achieves linear time complexity in both encoding and decoding by projecting the observable context tokens into a fixed number of latent tokens and conditioning on them to decode the masked tokens through cross-attention. Empowered by linear complexity and bidirectional modeling, our method demonstrates significant improvements over autoregressive transformers in both quality and speed when generating moderately long videos. | - |
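The abstract's core mechanism is a latent bottleneck: context tokens are compressed into a fixed number of latent tokens via cross-attention, and masked tokens are then decoded by attending to those latents, so attention cost grows linearly with sequence length rather than quadratically. The following is a minimal NumPy sketch of that idea, not the thesis's actual MeBT implementation; all names, dimensions, and the single-head, projection-free attention are illustrative assumptions.

```python
import numpy as np

def softmax(a, axis=-1):
    # Numerically stable softmax over the given axis.
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, d):
    # Single-head scaled dot-product cross-attention; learned
    # query/key/value projections are omitted for brevity.
    scores = queries @ keys_values.T / np.sqrt(d)
    return softmax(scores) @ keys_values

rng = np.random.default_rng(0)
d, n_ctx, n_latent, n_masked = 32, 1024, 16, 1024

context = rng.normal(size=(n_ctx, d))     # observable context token embeddings
latents = rng.normal(size=(n_latent, d))  # fixed-size latent array (hypothetical init)
masked = rng.normal(size=(n_masked, d))   # embeddings of masked tokens to decode

# Encode: latents attend to the context -> O(n_ctx * n_latent),
# linear in the number of context tokens since n_latent is fixed.
z = cross_attention(latents, context, d)

# Decode: masked tokens attend only to the latents -> O(n_masked * n_latent),
# linear in the number of masked tokens.
out = cross_attention(masked, z, d)
print(out.shape)  # (1024, 32)
```

Because `n_latent` is a constant, both stages scale linearly with the video's token count, in contrast to the quadratic cost of full self-attention over `n_ctx + n_masked` tokens.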
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.subject | Generative modeling of videos; bidirectional transformer; memory efficiency; latent variable compression | - |
dc.subject | Generative modeling of videos; bidirectional transformer; memory efficiency; latent bottleneck | - |
dc.title | Towards end-to-end generative modeling of long videos with memory-efficient bidirectional transformers | - |
dc.title.alternative | A study on end-to-end generative modeling of long videos using memory-efficient bidirectional transformers | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology : School of Computing | - |
dc.contributor.alternativeauthor | Hong, Seunghoon | - |