On improving acoustic modeling in speech recognition based on continuous density HMM (연속밀도 HMM을 이용한 음성인식에서의 음향모델링 개선)

The hidden Markov model (HMM) has recently become the predominant approach to speech recognition. Although the conventional HMM is good at modeling the stationary and sequential characteristics of speech signals, it has two inherent drawbacks: poor duration modeling and weak discrimination between competing classes. In this dissertation, we present various methods to improve acoustic modeling in speech recognition based on continuous density HMM. First, we propose to model and incorporate context-dependent word duration information to reduce insertion and deletion errors in connected digit recognizers. The proposed method differs from the conventional postprocessing-based method in that it is incorporated directly into the Viterbi decoding algorithm. Experimental results show that the proposed method reduces word error rates by as much as 10% for unknown-length decoding, while the postprocessing method does not achieve significant improvements over a baseline system. Simple duration modeling by a bounded uniform distribution achieves performance improvements comparable to those of detailed duration modeling by a gamma or Gaussian distribution while keeping complexity low, and is therefore a good compromise between performance and complexity. Second, we propose a supersegment-based postprocessing approach to improve recognition accuracy for connected digit recognition. A supersegment for a string is a concatenation of one or more segments that share similar begin and end points with the other strings, within some tolerance. In this approach, N-best candidate strings are generated by a conventional recognizer and string-matched so that they are all represented by the same number of supersegments. We obtain total log likelihoods by combining the scores from the conventional first-stage recognizer and a supersegment-based second-stage postprocessor. Experimental results show that connected digit recognizers using the supersegment-based postprocessing method achieve about 20% decr...
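As a rough illustration of the duration modeling idea in the abstract, the sketch below scores a word duration with a bounded uniform distribution and, for comparison, with a Gaussian, then adds the duration score to an acoustic log likelihood at a word boundary. It is not the dissertation's implementation: the function names, the `weight` parameter, and the duration bounds are all hypothetical and chosen only for illustration.

```python
import math


def bounded_uniform_log_prob(duration, min_dur, max_dur):
    """Log-probability of a word duration (in frames) under a uniform
    distribution on the integer range [min_dur, max_dur] (assumed bounds)."""
    if min_dur <= duration <= max_dur:
        return -math.log(max_dur - min_dur + 1)
    return float("-inf")  # durations outside the bounds are ruled out


def gaussian_log_prob(duration, mean, std):
    """Log-density of a duration under a Gaussian, shown for comparison."""
    z = (duration - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def score_word_end(acoustic_log_like, duration, min_dur, max_dur, weight=1.0):
    """Add a duration score to a path's acoustic log likelihood at a word end.
    `weight` is a hypothetical scaling factor, not taken from the source."""
    dur_score = bounded_uniform_log_prob(duration, min_dur, max_dur)
    if dur_score == float("-inf"):
        return dur_score  # prune paths whose word duration is out of bounds
    return acoustic_log_like + weight * dur_score
```

Under these assumptions, the bounded uniform model needs only two parameters per word (the minimum and maximum duration), and inside a decoder it mainly acts as a constraint that prunes word hypotheses that are too short or too long.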
Advisors
Un, Chong-Kwan (은종관)
Description
Korea Advanced Institute of Science and Technology (KAIST) : Department of Electrical and Electronic Engineering,
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
1997
Identifier
114110/325007 / 000925027
Language
eng
Description

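Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Electrical and Electronic Engineering, 1997.2, [ vii, 110 p. ]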

Keywords

HMM; Speech recognition; Hidden Markov model; Acoustic modeling

URI
http://hdl.handle.net/10203/36365
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=114110&flag=dissertation
Appears in Collection
EE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
