DSpace at KOASAS: Zero-crossing-based sound source localization, segregation and recognition

Zero-crossing-based sound source localization, segregation and recognition

An, Sung-Jun / 안성준

This thesis presents new methods for spatial hearing algorithms. The first is zero-crossing-based sound source localization that exploits the precedence effect in severely reverberant conditions. The second is binaural mask estimation for sound segregation and recognition when multiple sound sources are present simultaneously. The precedence effect is a psychoacoustic effect related to a group of auditory phenomena. In reverberant conditions in particular, when similar sounds from one or more sources reach the listener from different locations, the direct sound arrives first and is also heard first. This gives the listener the impression that the sound comes from that location alone and suppresses the perception of later arrivals. By adapting the precedence effect to our sound source localization algorithm, we obtain very good simulation results for localization under severely reverberant conditions. For sound segregation and recognition, we use a ratio masking method. The mask is determined from the estimated sound source directions using spatial cues such as inter-aural time differences (ITDs) and inter-aural intensity differences (IIDs). In the proposed method, ITDs are estimated using the statistical properties of zero-crossings detected from binaural filter-bank outputs. We also consider estimating ITDs with the aid of IID samples to cope with the phase ambiguities of ITD estimates at high frequencies. For the masking method, we consider using the power ratio of the target to the interference sources. We show that this power ratio is optimal from the viewpoint of reconstructing the target speech signal and that it can be used effectively in missing-data speech recognition. To estimate the power ratio, the expectation-maximization (EM) method is applied to the ITD estimates. As a result, the proposed method is able to provide a better masking scheme for speech segregation and...
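As a rough illustration of two ideas named in the abstract, the following Python sketch estimates an ITD from zero-crossing times in a single frequency channel and computes a power-ratio mask. It is not the thesis's algorithm: the function names, the nearest-crossing pairing, the median as the summary statistic, the sign convention (left minus right), the `max_itd` limit, and the mask form target / (target + interference) are all assumptions made for illustration. The precedence-effect weighting, the IID-based disambiguation, and the EM step are not modeled.

```python
import numpy as np

def upward_zero_crossings(x, fs):
    """Return times (s) of negative-to-positive zero crossings of x."""
    x = np.asarray(x, dtype=float)
    idx = np.where((x[:-1] < 0) & (x[1:] >= 0))[0]
    # Linear interpolation between the two samples that straddle zero.
    frac = -x[idx] / (x[idx + 1] - x[idx])
    return (idx + frac) / fs

def estimate_itd(left, right, fs, max_itd=1e-3):
    """Median time difference (left minus right) between paired crossings.

    Assumption: each left-channel crossing is paired with the nearest
    right-channel crossing, and pairs farther apart than max_itd are dropped.
    A negative result means the left channel leads.
    """
    tl = upward_zero_crossings(left, fs)
    tr = upward_zero_crossings(right, fs)
    if tl.size == 0 or tr.size == 0:
        return None
    diff = tl[:, None] - tr[None, :]            # all pairwise differences
    nearest = diff[np.arange(tl.size), np.argmin(np.abs(diff), axis=1)]
    nearest = nearest[np.abs(nearest) <= max_itd]
    return float(np.median(nearest)) if nearest.size else None

def ratio_mask(target_power, interference_power, eps=1e-12):
    """Hypothetical ratio mask in [0, 1]: target / (target + interference)."""
    tp = np.asarray(target_power, dtype=float)
    ip = np.asarray(interference_power, dtype=float)
    return tp / (tp + ip + eps)
```

In this sketch the inputs to `estimate_itd` stand for one band of a binaural filter bank; pairing crossings within a single narrow band is also why estimates become ambiguous once the period of the band's center frequency falls below the largest plausible ITD, which is the high-frequency phase ambiguity the abstract mentions.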

Advisors: Kim, Sung-Ho (김성호); Kil, Rhee-Man (길이만)

Description: Korea Advanced Institute of Science and Technology : Department of Mathematical Sciences

Publisher: Korea Advanced Institute of Science and Technology

Issue Date: 2010

Identifier: 418773/325007 / 020045146

Language: eng

Description: Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : Department of Mathematical Sciences, 2010.2, [ ix, 87 p. ]

Keywords: Reverberation; Speech Recognition; Speech Segregation; Sound Source Localization; Zero-Crossing

URI: http://hdl.handle.net/10203/41935

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=418773&flag=dissertation

Appears in Collection: MA-Theses_Ph.D. (Doctoral theses)

Files in This Item: There are no files associated with this item.

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.