DSpace at KOASAS: Suggesting sounds for images from video collections

Suggesting sounds for images from video collections

DC Field	Value	Language
dc.contributor.author	Solèr, Matthias	ko
dc.contributor.author	Bazin, Jean-Charles	ko
dc.contributor.author	Wang, Oliver	ko
dc.contributor.author	Krause, Andreas	ko
dc.contributor.author	Sorkine-Hornung, Alexander	ko
dc.date.accessioned	2017-09-08T05:34:10Z	-
dc.date.available	2017-09-08T05:34:10Z	-
dc.date.created	2017-09-04	-
dc.date.issued	2016-10-08	-
dc.identifier.citation	14th European Conference on Computer Vision, ECCV 2016, pp.900 - 917	-
dc.identifier.uri	http://hdl.handle.net/10203/225727	-
dc.description.abstract	Given a still image, humans can easily think of a sound associated with this image. For instance, people might associate the picture of a car with the sound of a car engine. In this paper we aim to retrieve sounds corresponding to a query image. To solve this challenging task, our approach exploits the correlation between the audio and visual modalities in video collections. A major difficulty is the high amount of uncorrelated audio in the videos, i.e., audio that does not correspond to the main image content, such as voice-over, background music, added sound effects, or sounds originating off-screen. We present an unsupervised, clustering-based solution that is able to automatically separate correlated sounds from uncorrelated ones. The core algorithm is based on a joint audio-visual feature space, in which we perform iterated mutual kNN clustering in order to effectively filter out uncorrelated sounds. To this end we also introduce a new dataset of correlated audio-visual data, on which we evaluate our approach and compare it to alternative solutions. Experiments show that our approach can successfully deal with a high amount of uncorrelated audio.	-
dc.language	English	-
dc.publisher	European Conference on Computer Vision Committee	-
dc.title	Suggesting sounds for images from video collections	-
dc.type	Conference	-
dc.identifier.wosid	000389501700059	-
dc.identifier.scopusid	2-s2.0-84996931564	-
dc.type.rims	CONF	-
dc.citation.beginningpage	900	-
dc.citation.endingpage	917	-
dc.citation.publicationname	14th European Conference on Computer Vision, ECCV 2016	-
dc.identifier.conferencecountry	NE	-
dc.identifier.conferencelocation	Oudemanhuispoort, University of Amsterdam	-
dc.identifier.doi	10.1007/978-3-319-48881-3_59	-
dc.contributor.localauthor	Bazin, Jean-Charles	-
dc.contributor.nonIdAuthor	Solèr, Matthias	-
dc.contributor.nonIdAuthor	Wang, Oliver	-
dc.contributor.nonIdAuthor	Krause, Andreas	-
dc.contributor.nonIdAuthor	Sorkine-Hornung, Alexander	-
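
The abstract describes filtering uncorrelated sounds by iterated mutual kNN clustering in a joint audio-visual feature space. The following is a minimal sketch of that general idea, not the authors' implementation: the function names (`knn_indices`, `mutual_knn_filter`), the parameters (`k`, `min_mutual`, `max_iters`), the Euclidean distance, the stopping rule, and the toy one-dimensional "visual" and "audio" descriptors are all assumptions made for illustration.

```python
import numpy as np


def knn_indices(features, k):
    """Return an (n, k) array with each row's k nearest neighbours (Euclidean), self excluded."""
    diffs = features[:, None, :] - features[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # a sample is never its own neighbour
    return np.argsort(dists, axis=1)[:, :k]


def mutual_knn_filter(features, k=3, min_mutual=1, max_iters=10):
    """Repeatedly drop samples that have fewer than `min_mutual` mutual kNN links.

    Two samples are mutually linked when each appears in the other's kNN list.
    Returns the indices (into `features`) of the samples that survive.
    """
    keep = np.arange(len(features))
    for _ in range(max_iters):
        if len(keep) <= k:
            break  # too few samples left to form k neighbours
        n = len(keep)
        nbrs = knn_indices(features[keep], k)
        adj = np.zeros((n, n), dtype=bool)
        adj[np.repeat(np.arange(n), k), nbrs.ravel()] = True
        mutual = adj & adj.T  # keep only links that hold in both directions
        ok = mutual.sum(axis=1) >= min_mutual
        if ok.all():
            break  # nothing removed this round, so the set is stable
        keep = keep[ok]
    return keep


# Toy data (hypothetical): one visual and one audio descriptor per video segment.
visual = np.array([[0.0], [1.0], [0.0], [1.0], [10.0], [11.0], [10.0], [11.0], [100.0], [-80.0]])
audio = np.array([[0.0], [0.0], [1.0], [1.0], [10.0], [10.0], [11.0], [11.0], [-50.0], [60.0]])
joint = np.hstack([visual, audio])  # joint audio-visual feature space

kept = mutual_knn_filter(joint, k=3)
print(kept.tolist())  # [0, 1, 2, 3, 4, 5, 6, 7]
```

In this toy setup the last two segments pair a visual descriptor with an unrelated audio descriptor, so no other sample lists them among its nearest neighbours and they are discarded in the first pass; the two consistent groups remain.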

Appears in Collection: GCT-Conference Papers(학술회의논문)

Files in This Item: There are no files associated with this item.

This item is cited by 6 other documents in WoS.

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
