DC Field | Value | Language |
---|---|---|
dc.contributor.author | Solèr, Matthias | ko |
dc.contributor.author | Bazin, Jean-Charles | ko |
dc.contributor.author | Wang, Oliver | ko |
dc.contributor.author | Krause, Andreas | ko |
dc.contributor.author | Sorkine-Hornung, Alexander | ko |
dc.date.accessioned | 2017-09-08T05:34:10Z | - |
dc.date.available | 2017-09-08T05:34:10Z | - |
dc.date.created | 2017-09-04 | - |
dc.date.issued | 2016-10-08 | - |
dc.identifier.citation | 14th European Conference on Computer Vision, ECCV 2016, pp.900 - 917 | - |
dc.identifier.uri | http://hdl.handle.net/10203/225727 | - |
dc.description.abstract | Given a still image, humans can easily think of a sound associated with this image. For instance, people might associate the picture of a car with the sound of a car engine. In this paper we aim to retrieve sounds corresponding to a query image. To solve this challenging task, our approach exploits the correlation between the audio and visual modalities in video collections. A major difficulty is the high amount of uncorrelated audio in the videos, i.e., audio that does not correspond to the main image content, such as voice-over, background music, added sound effects, or sounds originating off-screen. We present an unsupervised, clustering-based solution that is able to automatically separate correlated sounds from uncorrelated ones. The core algorithm is based on a joint audio-visual feature space, in which we perform iterated mutual kNN clustering in order to effectively filter out uncorrelated sounds. To this end we also introduce a new dataset of correlated audio-visual data, on which we evaluate our approach and compare it to alternative solutions. Experiments show that our approach can successfully deal with a high amount of uncorrelated audio. | - |
dc.language | English | - |
dc.publisher | European Conference on Computer Vision Committee | - |
dc.title | Suggesting sounds for images from video collections | - |
dc.type | Conference | - |
dc.identifier.wosid | 000389501700059 | - |
dc.identifier.scopusid | 2-s2.0-84996931564 | - |
dc.type.rims | CONF | - |
dc.citation.beginningpage | 900 | - |
dc.citation.endingpage | 917 | - |
dc.citation.publicationname | 14th European Conference on Computer Vision, ECCV 2016 | - |
dc.identifier.conferencecountry | NE | - |
dc.identifier.conferencelocation | Oudemanhuispoort, University of Amsterdam | - |
dc.identifier.doi | 10.1007/978-3-319-48881-3_59 | - |
dc.contributor.localauthor | Bazin, Jean-Charles | - |
dc.contributor.nonIdAuthor | Solèr, Matthias | - |
dc.contributor.nonIdAuthor | Wang, Oliver | - |
dc.contributor.nonIdAuthor | Krause, Andreas | - |
dc.contributor.nonIdAuthor | Sorkine-Hornung, Alexander | - |