Dynamic weighting scheme based N-gram adaptation for large vocabulary continuous speech recognition = Large vocabulary continuous speech recognition using N-gram adaptation based on a dynamic weighting scheme

Large vocabulary continuous speech recognition (LVCSR) is the task in which a machine recognizes natural, free-flowing speech with a very large or practically unlimited vocabulary. The general architecture of a modern LVCSR system consists of a hidden Markov model (HMM) based acoustic model and an N-gram based language model. The N-gram is the dominant language model in LVCSR because it is easy to implement, couples smoothly with a speech recognizer, and is very effective. Unfortunately, an N-gram model depends on the domain of its training data and therefore cannot handle several domains simultaneously. N-gram adaptation has gained popularity because it addresses this domain dependency. The technique updates a background N-gram model into a domain-specific model using little or no manually annotated adaptation corpus. Two major problems restrict the performance of the adapted N-gram: acquiring the adaptation corpus, and combining the background model with the adapted model. First, we use the language modeling approach to information retrieval (IR) to collect the adaptation corpus with an N-gram retrieval model. Recently, IR techniques have been widely used to build training corpora for N-gram adaptation. Among the various IR techniques, the language modeling approach to IR uses the similarity between the language model of a query and the language model of a document as the distance measure. Experimental results show that using bigram and trigram retrieval models instead of a unigram model improves the quality of the collected adaptation corpus. Second, a dynamic language model interpolation coefficient is proposed to solve the merging problem. The proposed interpolation coefficient varies according to the segment of the recognition hypothesis. All word hypotheses in a certain segment of the input speech were used as the vali...
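To illustrate the idea of a segment-dependent interpolation coefficient, the following is a minimal sketch and not the thesis's implementation. It assumes plain linear interpolation of a background and an adapted bigram model; the dictionary representation, the names `interpolate` and `segment_log_prob`, the `floor` probability for unseen bigrams, and the toy probabilities are all hypothetical. How the thesis actually estimates the coefficient for each segment is not recoverable from the abstract, so the weight `lam` is simply passed in.

```python
import math
from typing import Dict, Sequence, Tuple

# Hypothetical toy representation: (previous_word, word) -> probability.
Bigram = Dict[Tuple[str, str], float]


def interpolate(p_background: float, p_adapted: float, lam: float) -> float:
    """Linear interpolation: lam * adapted + (1 - lam) * background."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_adapted + (1.0 - lam) * p_background


def segment_log_prob(
    words: Sequence[str],
    background: Bigram,
    adapted: Bigram,
    lam: float,
    floor: float = 1e-10,  # assumed stand-in for real smoothing
) -> float:
    """Log-probability of one segment under the interpolated bigram model.

    A different `lam` can be supplied for every segment, which is the
    sense in which the coefficient is "dynamic" in this sketch.
    """
    log_prob = 0.0
    prev = "<s>"  # assumed segment-start symbol
    for word in words:
        p_bg = background.get((prev, word), floor)
        p_ad = adapted.get((prev, word), floor)
        log_prob += math.log(interpolate(p_bg, p_ad, lam))
        prev = word
    return log_prob


if __name__ == "__main__":
    background = {("<s>", "stock"): 0.1, ("stock", "prices"): 0.2}
    adapted = {("<s>", "stock"): 0.5, ("stock", "prices"): 0.6}
    # Score the same segment under two hypothetical weights.
    for lam in (0.2, 0.8):
        print(lam, segment_log_prob(["stock", "prices"], background, adapted, lam))
```

In this toy setup, a segment whose words are better covered by the adapted model scores higher as `lam` grows, which is the motivation for letting the weight vary from segment to segment rather than fixing one value for the whole utterance.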
Advisors
Oh, Yung-Hwan (오영환)
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2006
Identifier
258163/325007  / 000995384
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : Department of Computer Science, 2006.8, [ x, 96 p. ]

Keywords

language model adaptation; speech recognition; dynamic weighting scheme; dynamic weight; language model; large vocabulary continuous speech recognition

URI
http://hdl.handle.net/10203/33213
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=258163&flag=t
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.