DSpace at KOASAS: Audio-visual learning with semantically similar samples



Audio-visual learning with semantically similar samples (Korean title: Audio-visual associative learning using semantic similarity)


Ryu, Hyeonggon

Instance discrimination-based contrastive learning is a learning method that contrasts positive and negative pairs. It assumes that each negative pair contains different semantic information. However, this assumption does not always hold, because training batches are constructed randomly. Intuitively, such faulty negative pairs disturb training and degrade model performance. This work aims to solve the faulty negative problem for instance discrimination-based audio-visual learning. Existing audio-visual works employ contrastive learning by treating corresponding audio-visual pairs from the same source as positives and randomly mismatched pairs as negatives. As in general instance discrimination-based contrastive learning, these negative pairs may contain semantically matched audio-visual information. The key contribution of this work is showing that semantically similar samples can compensate for the effect of faulty negative pairs. Our approach incorporates semantically similar samples directly into the contrastive learning objective. It is applied to two tasks: audio-visual sound source localization and visually grounded speech. We demonstrate the effectiveness of our approach on both tasks.
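As a rough illustration of the idea in the abstract, the sketch below shows one possible way to fold semantically similar samples into a contrastive objective. It is not taken from the thesis: it uses a multi-positive, InfoNCE-style loss in which each sample's nearest neighbours are treated as extra positives instead of negatives. The function names, the nearest-neighbour rule, the value of k, and the temperature are all assumptions made for this example.

```python
import numpy as np

def l2_normalize(x):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def log_softmax(x, axis=-1):
    # Numerically stable log-softmax.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def similar_sample_mask(features, k=1):
    # Hypothetical rule: each sample's k nearest neighbours (by cosine
    # similarity) plus the sample itself count as positives.
    f = l2_normalize(features)
    sim = f @ f.T
    np.fill_diagonal(sim, -np.inf)           # exclude self from the neighbour search
    nn = np.argsort(-sim, axis=1)[:, :k]     # indices of the k most similar samples
    mask = np.eye(len(features), dtype=bool)
    mask[np.arange(len(features))[:, None], nn] = True
    return mask

def contrastive_loss(audio, visual, positive_mask, temperature=0.07):
    # Audio-to-visual loss: positive_mask[i, j] is True when visual sample j
    # is treated as a positive for audio sample i. With an identity mask this
    # reduces to the usual instance discrimination (InfoNCE) loss.
    sim = l2_normalize(audio) @ l2_normalize(visual).T / temperature
    log_prob = log_softmax(sim, axis=1)
    per_row = (log_prob * positive_mask).sum(axis=1) / positive_mask.sum(axis=1)
    return -per_row.mean()
```

With an identity mask every off-diagonal pair is pushed apart, including pairs that happen to share semantics. Passing the mask from similar_sample_mask instead removes those pairs from the negative set and pulls them together, which is one way to read "compensating for faulty negatives".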

Advisor: Kweon, In So (권인소)

Description: Korea Advanced Institute of Science and Technology (KAIST), Division of Future Vehicle

Publisher: Korea Advanced Institute of Science and Technology (KAIST)

Issue Date: 2023

Identifier: 325007

Language: eng

Description: Master's thesis, Korea Advanced Institute of Science and Technology (KAIST), Division of Future Vehicle, 2023.2, [v, 30 p.]

Keywords: Audio-visual learning; Sound source localization; Visually grounded speech (Korean keywords: audio-visual associative learning; sound source localization; visual understanding of speech)

URI: http://hdl.handle.net/10203/308327

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1032366&flag=dissertation

Appears in Collection: PD-Theses_Master (Master's theses)

Files in This Item: There are no files associated with this item.


