Environmental Audio Scene and Activity Recognition through Mobile-based Crowdsourcing

Cited 22 times in Web of Science; 0 times in Scopus
Environmental audio recognition through mobile devices is difficult because of background noise, unseen audio events, and changes in audio channel characteristics due to the phone's context, e.g., whether the phone is in the user's pocket or hand. We propose a crowdsourcing framework that models the combination of scene, event, and phone context to overcome these issues. The framework gathers audio data from many people and shares user-generated models through a cloud server to accurately classify unseen audio data. A Gaussian histogram is used to represent an audio clip with a small number of parameters, and a k-nearest neighbor classifier allows new training data to be incorporated into the system easily. Using the Kullback-Leibler divergence between two Gaussian histograms as the distance measure, we find that audio scenes, events, and phone context are classified with 85.2%, 77.6%, and 88.9% accuracy, respectively.
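The abstract's classification pipeline can be sketched in a few lines: each clip is summarized by Gaussian parameters, clips are compared with a KL-based distance, and a k-nearest-neighbor vote assigns the label. The sketch below is illustrative only, under assumptions not stated in the abstract: it treats a "Gaussian histogram" as a list of per-bin (mean, variance) pairs, uses the closed-form KL divergence between univariate Gaussians, and symmetrizes it so the distance is order-independent; the paper's exact feature extraction and distance definition may differ.

```python
import math

def kl_gaussian(mu_p, var_p, mu_q, var_q):
    """Closed-form KL divergence KL(p || q) between two univariate Gaussians."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)

def kl_histogram_distance(hist_p, hist_q):
    """Symmetrized KL divergence summed over corresponding bins.

    Each 'Gaussian histogram' is assumed to be a list of (mean, variance)
    pairs, one Gaussian per bin -- an assumption, not the paper's exact model.
    """
    total = 0.0
    for (mu_p, var_p), (mu_q, var_q) in zip(hist_p, hist_q):
        total += (kl_gaussian(mu_p, var_p, mu_q, var_q)
                  + kl_gaussian(mu_q, var_q, mu_p, var_p))
    return total

def knn_classify(query, training_set, k=3):
    """Majority vote among the k training clips nearest to the query.

    training_set is a list of (gaussian_histogram, label) pairs; adding a new
    labeled clip is just appending to this list, which is why k-NN makes it
    easy to fold in crowdsourced training data.
    """
    neighbors = sorted(training_set,
                       key=lambda item: kl_histogram_distance(query, item[0]))[:k]
    votes = {}
    for _, label in neighbors:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Toy usage with hypothetical two-bin histograms and scene labels.
train = [
    ([(0.0, 1.0), (0.0, 1.0)], "street"),
    ([(0.1, 1.0), (0.2, 1.0)], "street"),
    ([(5.0, 1.0), (5.0, 1.0)], "cafe"),
    ([(5.1, 1.0), (4.9, 1.0)], "cafe"),
]
query = [(0.05, 1.0), (0.1, 1.0)]
print(knn_classify(query, train))  # -> street
```

Because identical Gaussians have zero KL divergence, a clip is always its own nearest neighbor, and new crowdsourced models can be shared simply as lists of (mean, variance) parameters rather than raw audio.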
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Issue Date
2012-05
Language
English
Article Type
Article
Citation

IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, v.58, no.2, pp.700 - 705

ISSN
0098-3063
URI
http://hdl.handle.net/10203/103916
Appears in Collection
EE-Journal Papers (Journal Papers)