DSpace at KOASAS: On improving acoustic modeling in speech recognition based on continuous density HMM


On improving acoustic modeling in speech recognition based on continuous density HMM

Kwon, Oh-Wook / 권오욱

The hidden Markov model (HMM) has recently become the predominant approach to speech recognition. Although the conventional HMM models the stationary and sequential characteristics of speech signals well, it has two inherent drawbacks: poor duration modeling and weak discrimination between competing classes. In this dissertation, we present several methods to improve acoustic modeling in speech recognition based on continuous density HMM. First, we propose to model and incorporate context-dependent word duration information to reduce insertion and deletion errors in connected digit recognizers. Unlike the conventional postprocessing-based method, the proposed method is incorporated directly into the Viterbi decoding algorithm. Experimental results show that the proposed method reduces word error rates by as much as 10% for unknown-length decoding, while the postprocessing method does not achieve significant improvements over a baseline system. Simple duration modeling with a bounded uniform distribution achieves performance improvements comparable to those of detailed duration modeling with a gamma or Gaussian distribution while keeping complexity low, and is therefore a good compromise between performance and complexity. Second, we propose a supersegment-based postprocessing approach to improve recognition accuracy for connected digit recognition. A supersegment of a string is a concatenation of one or more segments whose begin- and end-points match those of the other strings within some tolerance. In this approach, N-best candidate strings are generated by a conventional recognizer and string-matched so that they are all represented by the same number of supersegments. We obtain total log likelihoods by combining the scores of the conventional first-stage recognizer and a supersegment-based second-stage postprocessor. Experimental results show that connected digit recognizers using the supersegment-based postprocessing method achieve about 20% decr...
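As a rough illustration of the first idea in the abstract, the sketch below shows how a word duration score could be added to an acoustic score at a word boundary during decoding. It is not the thesis's implementation: the function names, the frame-count bounds, and the `weight` parameter are all hypothetical, and in a context-dependent setup the bounds would be looked up per word and neighboring context rather than passed in directly.

```python
import math


def bounded_uniform_log_prob(duration, d_min, d_max):
    """Log-probability of a word duration (in frames) under a uniform
    distribution on the integers d_min..d_max; -inf outside the bounds."""
    if d_min <= duration <= d_max:
        return -math.log(d_max - d_min + 1)
    return float("-inf")


def gaussian_log_prob(duration, mean, std):
    """Log-density of a word duration under a Gaussian duration model."""
    z = (duration - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def word_exit_score(acoustic_log_lik, duration, d_min, d_max, weight=1.0):
    """Hypothetical word-end score: acoustic log-likelihood plus a weighted
    duration term. A -inf result means the decoder would prune this word
    hypothesis because its duration is implausible."""
    return acoustic_log_lik + weight * bounded_uniform_log_prob(
        duration, d_min, d_max
    )
```

With the bounded uniform model, a word hypothesis that is too short or too long receives a score of negative infinity, which is one plausible way such a model could suppress insertion and deletion errors at very little computational cost.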

Advisors: Un, Chong-Kwan (은종관)

Description: Korea Advanced Institute of Science and Technology: Department of Electrical Engineering

Publisher: Korea Advanced Institute of Science and Technology

Issue Date: 1997

Identifier: 114110/325007 / 000925027

Language: eng

Description: Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology: Department of Electrical Engineering, 1997.2, [vii, 110 p.]

Keywords: HMM; Speech recognition; Hidden Markov model; Acoustic modeling

URI: http://hdl.handle.net/10203/36365

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=114110&flag=dissertation

Appears in Collection: EE-Theses_Ph.D. (Doctoral theses)

Files in This Item: There are no files associated with this item.


Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
