DSpace at KOASAS: Global-and-Local Relative Position Embedding for Unsupervised Video Summarization

Global-and-Local Relative Position Embedding for Unsupervised Video Summarization

Jung, Yunjae / Cho, Donghyeon / Woo, Sanghyun / Kweon, In-So

To summarize video content properly, it is important to grasp the sequential structure of the video as well as the long-term dependencies between frames. This need is especially evident in unsupervised learning. One possible solution is to use a well-known technique from natural language processing for modeling long-term dependency and sequential order: self-attention with relative position embedding (RPE). However, compared to natural language processing, video summarization requires capturing a much longer global context. In this paper, we therefore present a novel input decomposition strategy, which samples the input both globally and locally. This provides an effective temporal window for RPE to operate on and significantly improves overall computational efficiency. Combining Global-and-Local input decomposition with RPE yields GL-RPE. Our approach allows the network to capture both local and global interdependencies between video frames effectively. Since GL-RPE can be easily integrated into existing methods, we apply it to two different unsupervised backbones. We provide extensive ablation studies and visual analysis to verify the effectiveness of the proposals. We demonstrate that our approach achieves new state-of-the-art performance on the recently proposed rank order-based metrics: Kendall’s τ and Spearman’s ρ. Furthermore, although our method is unsupervised, we show that it performs on par with the fully supervised method.
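
The idea of sampling the input both globally and locally can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything here is an assumption made for illustration: the function names `decompose` and `relative_positions`, the choice of contiguous chunks for the local view, and strided sampling for the global view are hypothetical and may differ from the authors' implementation. The sketch only shows why such a split shortens the range of relative offsets that an RPE has to cover.

```python
def decompose(num_frames, num_groups):
    """Split frame indices into local and global groups.

    Assumption (not from the abstract): the local view takes contiguous
    chunks of frames, and the global view takes every num_groups-th frame.
    """
    if num_frames % num_groups != 0:
        raise ValueError("num_frames must be divisible by num_groups")
    size = num_frames // num_groups
    frames = list(range(num_frames))
    # Local view: neighbouring frames stay together.
    local_groups = [frames[g * size:(g + 1) * size] for g in range(num_groups)]
    # Global view: each group spans the whole video with a fixed stride.
    global_groups = [frames[g::num_groups] for g in range(num_groups)]
    return local_groups, global_groups


def relative_positions(group_len):
    """Offsets j - i between positions inside one group."""
    return [[j - i for j in range(group_len)] for i in range(group_len)]


local_groups, global_groups = decompose(num_frames=12, num_groups=3)
offsets = relative_positions(len(local_groups[0]))
# Largest offset inside a group, compared with the full sequence.
max_offset_in_group = max(max(row) for row in offsets)
max_offset_full = 12 - 1
```

With 12 frames and 3 groups, each group holds 4 frames, so relative offsets within a group stay between -3 and 3 rather than between -11 and 11 for the undivided sequence. Under these assumptions, self-attention applied per group also works on 4-by-4 rather than 12-by-12 score matrices, which is one way to read the efficiency claim in the abstract.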

Publisher: European Conference on Computer Vision

Issue Date: 2020-08

Language: English

Citation: European Conference on Computer Vision, ECCV 2020

URI: http://hdl.handle.net/10203/278566

Appears in Collection: EE-Conference Papers

Files in This Item: There are no files associated with this item.
