Noise robust speech recognition using kernel-based top-down selective attention

A top-down selective attention model, drawn from psychological research, is proposed for recognizing isolated words in noisy environments. The model is applied to a hidden Markov model (HMM) classifier, which is widely used for automatic speech recognition. An attention filter is introduced at the output of the Mel-filterbank, whose filter shapes resemble those of the cochlear filterbank, where human attention might be processed. The attention filter is adapted by changing its gains to maximize the log likelihood of the attended test input speech. However, because the log likelihood of the attended input under the selected model keeps increasing, any input signal can be attended to any model, and the attention process then produces over-fitted attended data. A low-complexity constraint is proposed to prevent the attention filter from over-fitting. The first method uses bilinear kernels, which map the attention filter onto a lower-resolution subspace and thereby reduce its complexity effectively. Experiments were carried out with different grid sizes and different levels of white Gaussian noise. The recognition results are improved: the false recognition rates are 41% and 54% with 20 dB SNR and 15 dB SNR, respectively. However, the attention filter with bilinear kernels is limited in how well it can model attention in some cases, since the peak values of the attention filter are tied to the grid positions. The model therefore needs a mechanism for finding the proper center position and width of the receptive field. A second way to reduce the complexity of the attention filter uses Gaussian kernels, for which not only the weights but also the center positions and widths of the receptive fields are adapted. The attention filter with Gaussian kernels is adapted by gradient methods. The false recognition rates of this attention filter decrease by 36% and 46% at 20 dB SNR and 15 dB SNR, respectively. Although the bilinear model shows better p...
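The low-resolution constraint described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, and it rests on several assumptions made only for illustration: the attention gains act additively on log-Mel features, a single diagonal Gaussian stands in for the HMM score, and the grid size, learning rate, and all function names (`bilinear_basis`, `attention_filter`, `log_likelihood`, `adapt`) are hypothetical.

```python
import numpy as np

def bilinear_basis(n_full, n_grid):
    """(n_full, n_grid) matrix of 1-D linear interpolation (hat) kernels."""
    pos = np.linspace(0.0, n_grid - 1.0, n_full)  # full-res positions in grid units
    nodes = np.arange(n_grid)
    # Each kernel is 1 at its own node and falls linearly to 0 at its neighbours.
    return np.clip(1.0 - np.abs(pos[:, None] - nodes[None, :]), 0.0, None)

def attention_filter(grid_gains, n_frames, n_channels):
    """Expand coarse grid gains to a full (n_frames, n_channels) filter."""
    Bt = bilinear_basis(n_frames, grid_gains.shape[0])
    Bf = bilinear_basis(n_channels, grid_gains.shape[1])
    return Bt @ grid_gains @ Bf.T

def log_likelihood(x, mean, var):
    """Assumed stand-in for the HMM score: diagonal Gaussian log likelihood."""
    return -0.5 * np.sum((x - mean) ** 2 / var + np.log(2.0 * np.pi * var))

def adapt(features, mean, var, grid_shape=(3, 4), lr=1e-3, steps=200):
    """Gradient ascent on the coarse grid gains only (the low-resolution constraint)."""
    n_frames, n_channels = features.shape
    Bt = bilinear_basis(n_frames, grid_shape[0])
    Bf = bilinear_basis(n_channels, grid_shape[1])
    g = np.zeros(grid_shape)                  # zero log-gain means no attention
    for _ in range(steps):
        attended = features + Bt @ g @ Bf.T   # assumed additive gain in the log-Mel domain
        grad_full = -(attended - mean) / var  # d logL / d attended
        g += lr * (Bt.T @ grad_full @ Bf)     # chain rule through the kernels
    return g
```

Because only the grid gains are free parameters, the filter cannot reshape every time-frequency cell independently, which is the sense in which the low-complexity constraint limits over-fitting. Replacing the fixed hat kernels with Gaussian kernels whose centers and widths are also updated by gradient steps would correspond to the second method in the abstract.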
Advisors
Lee, Soo-Young (이수영)
Description
Korea Advanced Institute of Science and Technology (KAIST) : Electrical and Electronic Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2006
Identifier
258130/325007  / 020015227
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Electrical and Electronic Engineering, 2006.8, [ x, 83 p. ]

Keywords

speech recognition; HMM; selective attention; low resolution constraint; confidence measure

URI
http://hdl.handle.net/10203/36062
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=258130&flag=dissertation
Appears in Collection
EE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
