Language Model Adaptation Based on Topic Probability of Latent Dirichlet Allocation

Cited 10 times in Web of Science · Cited 0 times in Scopus
Two new methods are proposed for the unsupervised adaptation of a language model (LM) with a single sentence for automatic transcription tasks. In the training phase, training documents are clustered by latent Dirichlet allocation (LDA), and a domain-specific LM is then trained for each cluster. In the test phase, an adapted LM is formed as a linear mixture of the trained domain-specific LMs. Unlike previous adaptation methods, the proposed methods fully utilize the trained LDA model to estimate the weights assigned to the domain-specific LMs; therefore, the clustering and weight-estimation algorithms based on the trained LDA model are reliable. In continuous speech recognition benchmark tests, the proposed methods outperform other unsupervised LM adaptation methods based on latent semantic analysis, non-negative matrix factorization, and LDA with n-gram counting.
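
To make the mixture idea concrete, the following Python sketch clusters training documents with LDA, trains one LM per cluster, and then weights each domain LM by the LDA topic probabilities of a single test sentence. It uses gensim's LdaModel; the unigram domain LMs, the hard cluster assignment, and the function names (train, adapted_prob) are simplifying assumptions for illustration, not the paper's actual implementation.

    # Illustrative sketch only: gensim LDA plus unigram "domain LMs".
    # The paper uses n-gram LMs; unigrams keep the example short.
    from collections import Counter
    from gensim.corpora import Dictionary
    from gensim.models import LdaModel

    NUM_TOPICS = 4  # assumed number of domain clusters

    def train(train_docs):
        """train_docs: list of token lists, one per training document."""
        dictionary = Dictionary(train_docs)
        corpus = [dictionary.doc2bow(doc) for doc in train_docs]
        lda = LdaModel(corpus, num_topics=NUM_TOPICS, id2word=dictionary)

        # Hard-assign each document to its most probable topic, then
        # train one unigram LM per cluster from relative frequencies.
        counts = [Counter() for _ in range(NUM_TOPICS)]
        for doc, bow in zip(train_docs, corpus):
            topics = lda.get_document_topics(bow, minimum_probability=0.0)
            best_topic = max(topics, key=lambda t: t[1])[0]
            counts[best_topic].update(doc)
        domain_lms = []
        for cnt in counts:
            total = sum(cnt.values()) or 1
            domain_lms.append({w: c / total for w, c in cnt.items()})
        return dictionary, lda, domain_lms

    def adapted_prob(word, sentence, dictionary, lda, domain_lms,
                     floor=1e-9):
        """P(word) under the adapted LM: a linear mixture of the domain
        LMs whose weights are the LDA topic probabilities of the single
        test sentence, as described in the abstract."""
        bow = dictionary.doc2bow(sentence)
        weights = dict(lda.get_document_topics(bow, minimum_probability=0.0))
        return sum(weights.get(k, 0.0) * lm.get(word, floor)
                   for k, lm in enumerate(domain_lms))

The mixture computes P_adapt(w) = sum_k P(topic k | sentence) * P_k(w); substituting per-cluster n-gram LMs for the unigram P_k recovers the scheme described in the abstract.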
Publisher
Electronics and Telecommunications Research Institute (ETRI)
Issue Date
2016-06
Language
English
Article Type
Article
Citation
ETRI JOURNAL, v.38, no.3, pp.487-493
ISSN
1225-6463
DOI
10.4218/etrij.16.0115.0499
URI
http://hdl.handle.net/10203/209828
Appears in Collection
EE-Journal Papers (Journal Papers)
Files in This Item
95655.pdf (815.07 kB)