Enhanced cepstral representations and distance measures for speech recognition
(Korean title: A study on enhanced cepstral representations and distance comparison for speech recognition)

A mismatch between the training and testing environments severely degrades the performance of most speech recognizers. In this dissertation, we propose new algorithms for noise-resistant speech representations and distance measures. First, we propose a new speech analysis algorithm called spectral envelope linear predictive analysis (SELP). SELP is based on the spectral autocorrelation, which is defined as the autocorrelation of the discrete samples of the speech spectrum, at the spectral resolution given by the number of points of the discrete Fourier transform (DFT) of the speech. We prove that the spectral autocorrelation of voiced speech is periodic with a period equal to the fundamental frequency ($F_0$) and has maxima at multiples of $F_0$, assuming that the spectral envelope varies slowly. The spectral envelope is estimated by sampling the speech spectrum at the peak points of the spectral autocorrelation. The spectral envelope of unvoiced speech can also be obtained, based on the observation that its spectral autocorrelation shows periodicity as well. SELP has the advantage of estimating the spectral envelope without explicit $F_0$ detection or a voicing decision. The resulting spectral envelope, whose length is reduced by a factor of about $F_0$, is normalized linearly in frequency so that every analysis frame has the same frequency resolution. We then obtain a nonlinear spectral resolution by transforming the frequency axis of the spectral envelope onto a mel-frequency scale. The inverse DFT of this spectral envelope yields an estimate of the sample autocorrelation of the speech, from which we obtain cepstral coefficients that we call the spectral envelope cepstral coefficients (SECC). Recognition experiments show that SECC combined with the bandpass lifter yields performance higher than or comparable to that of conventional representations such as perceptual linear predictive analysis (PLP), the short-time ...
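The spectral autocorrelation idea in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names, the synthetic test frame (8 kHz sampling, a 250 Hz fundamental, 15 harmonics with 1/k amplitudes), the Hann window, the mean removal before correlating, and the lower lag bound are all assumptions chosen for illustration. The sketch only shows that the autocorrelation taken across DFT bins of a harmonic spectrum peaks at the harmonic spacing, and that sampling the spectrum at multiples of that spacing gives a coarse envelope; the frequency normalization, mel warping, and cepstral steps are not covered.

```python
import numpy as np


def synthetic_voiced_frame(f0_hz, fs_hz, n_samples, n_harmonics):
    """Hypothetical test signal: harmonics of f0 with slowly decaying (1/k) amplitudes."""
    t = np.arange(n_samples) / fs_hz
    k = np.arange(1, n_harmonics + 1)[:, None]  # harmonic numbers as a column
    return np.sum(np.cos(2 * np.pi * f0_hz * k * t) / k, axis=0)


def magnitude_spectrum(frame, n_fft):
    """Magnitude of the zero-padded DFT of a Hann-windowed frame."""
    return np.abs(np.fft.rfft(frame * np.hanning(len(frame)), n_fft))


def spectral_autocorrelation(mag):
    """Autocorrelation across frequency bins, normalized so that lag 0 equals 1.

    Removing the mean first is an assumption of this sketch, not a step
    stated in the abstract.
    """
    centered = mag - mag.mean()
    full = np.correlate(centered, centered, mode="full")
    acf = full[len(centered) - 1:]  # keep non-negative lags only
    return acf / acf[0]


def harmonic_spacing_bins(acf, min_lag=8):
    """Lag (in DFT bins) of the largest autocorrelation value at or above min_lag."""
    upper = len(acf) // 2
    return min_lag + int(np.argmax(acf[min_lag:upper]))


def sample_envelope(mag, spacing):
    """Coarse envelope: the spectrum sampled at multiples of the harmonic spacing."""
    return mag[spacing::spacing]


FS_HZ = 8000
N_FFT = 1024

frame = synthetic_voiced_frame(250.0, FS_HZ, 400, 15)
mag = magnitude_spectrum(frame, N_FFT)
acf = spectral_autocorrelation(mag)
spacing = harmonic_spacing_bins(acf)
envelope = sample_envelope(mag, spacing)

print("harmonic spacing:", spacing, "bins =", spacing * FS_HZ / N_FFT, "Hz")
print("envelope samples:", len(envelope))
```

With these assumed settings one DFT bin spans 8000/1024 Hz, so the detected spacing should correspond to a frequency close to the 250 Hz fundamental of the synthetic frame. No pitch tracker or voicing decision is involved; the spacing comes only from the periodicity of the spectrum itself.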
Advisors
Lee, Hwang-Soo (이황수)
Description
Korea Advanced Institute of Science and Technology (KAIST): Department of Information and Communication Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
1994
Identifier
69751/325007 / 000885124
Language
eng
Description

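Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Information and Communication Engineering, August 1994, [xvi, 145 p.]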

URI
http://hdl.handle.net/10203/39811
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=69751&flag=dissertation
Appears in Collection
ICE-Theses_Ph.D. (Doctoral dissertations)
Files in This Item
There are no files associated with this item.
