An improvement of stochastic feature extraction for robust speech recognition
(Korean title, translated: Improvement of a statistical feature-vector extraction method for noise-robust speech recognition)

A speech recognizer running in the real world is considerably affected by noise. A recognizer trained on clean speech cannot reliably recognize speech captured in noisy environments, because the noise introduces mismatches between the training and test conditions. These mismatches must therefore be compensated to achieve noise-robust speech recognition. In this thesis, we study an improvement of stochastic feature extraction based on band-SNR for noise-robust speech recognition. We propose a slightly modified version of the multi-band spectral subtraction method, denoted M-MSS, that adjusts the subtraction level of the noise spectrum according to band-SNR. We also modify the architecture of the stochastic feature extraction method; the modified version is denoted M-SFE. We then propose a stochastic feature extraction method that combines the two, using the advantages of both to account for the effect of noise more reliably. In M-MSS, a noise normalization factor is newly introduced to control the over-estimation factor depending on band-SNR. As a result, the subtraction level of the noise spectrum can be adjusted more reliably. Performance was better when spectral subtraction was applied in the power spectrum domain than in the mel-scale domain. Finally, we applied the framework of the stochastic feature extraction method to the modified multi-band spectral subtraction method. The resulting method, denoted MMSS-MSFE, compensates for variations of the noise spectrum more effectively by estimating the optimal clean-speech spectrum and using the mean and variance of the stochastic features. The proposed methods were evaluated on isolated word recognition under various noise environments. When only the mean of the stochastic feature was used, the average error rates of the M-MSS, M-SFE, and MMSS-MSFE methods, relative to the ordinary spectral subtraction (SS) method, were reduced by 18.6%, 11.0%,...
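The idea of adjusting the subtraction level per band according to band-SNR can be illustrated with a minimal sketch. This is not the thesis's M-MSS: the noise normalization factor is not specified in the abstract, so the sketch falls back on a piecewise-linear over-subtraction rule often used with multi-band spectral subtraction. The function names, the band layout, and the `floor` value are all assumptions chosen for illustration.

```python
import math


def over_subtraction_factor(band_snr_db):
    """Map a band SNR in dB to an over-subtraction factor.

    Assumed piecewise-linear rule (not taken from the thesis):
    subtract more aggressively when the band SNR is low.
    """
    if band_snr_db < -5.0:
        return 4.75
    if band_snr_db > 20.0:
        return 1.0
    return 4.0 - (3.0 / 20.0) * band_snr_db


def multiband_spectral_subtraction(noisy_power, noise_power, band_edges, floor=0.002):
    """Subtract a noise power estimate from one frame, band by band.

    noisy_power, noise_power: per-bin power values for a single frame.
    band_edges: (start, stop) bin index pairs, stop exclusive.
    floor: spectral floor as a fraction of the noisy power (hypothetical value).
    """
    clean = list(noisy_power)
    for start, stop in band_edges:
        signal_energy = sum(noisy_power[start:stop])
        noise_energy = sum(noise_power[start:stop])
        # Band SNR in dB, guarded against division by zero and log of zero.
        snr_db = 10.0 * math.log10(max(signal_energy, 1e-12) / max(noise_energy, 1e-12))
        alpha = over_subtraction_factor(snr_db)
        for k in range(start, stop):
            estimate = noisy_power[k] - alpha * noise_power[k]
            # Keep a small fraction of the noisy power instead of a negative value.
            clean[k] = max(estimate, floor * noisy_power[k])
    return clean


frame = [10.0, 10.0, 1.0, 1.0]
noise = [0.1, 0.1, 1.0, 1.0]
enhanced = multiband_spectral_subtraction(frame, noise, [(0, 2), (2, 4)])
```

In this toy frame the first band has a high SNR and is barely altered, while the second band sits at 0 dB, is over-subtracted, and falls back to the spectral floor.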
Advisors
Kim, Hoi-Rin
Description
Information and Communications University : School of Engineering
Publisher
Information and Communications University
Issue Date
2004
Identifier
392313/225023 / 020023964
Language
eng
Description

Thesis (Master's) - Information and Communications University : School of Engineering, 2004, [ ix, 45 p. ]

Keywords

Stochastic Feature Extraction; Robust Speech Recognition

URI
http://hdl.handle.net/10203/55229
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=392313&flag=dissertation
Appears in Collection
School of Engineering-Theses_Master
Files in This Item
There are no files associated with this item.
