Utterance verification using word-dependent thresholds based on probabilistic distributions of phone-level log-likelihood ratio

This paper proposes word voiceprint models for verifying the recognition results of a speech recognition system. A word voiceprint model holds word-dependent information based on the distributions of phone-level log-likelihood ratios and durations. Using the voiceprint model of a recognized word therefore yields a more reliable confidence score, because the model represents the utterance verification characteristics specific to that word. In addition, to compute the log-likelihood ratio-based word voiceprint score, the paper proposes a new log-scale normalization function that uses the distribution of the phone-level log-likelihood ratio, in place of the sigmoid function widely used when obtaining a phone-level log-likelihood ratio. This function emphasizes a misrecognized phone within a word. The word-specific information helps produce a more discriminative score against out-of-vocabulary words. The proposed method requires additional memory, but it achieves a 16.9% relative reduction in equal error rate compared with a baseline system that uses simple phone log-likelihood ratios.
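The contrast between sigmoid normalization and a distribution-based log-scale normalization can be illustrated with a minimal sketch. The abstract does not give the paper's actual function, so everything below is an assumption made for illustration: the per-phone Gaussian model of the log-likelihood ratio, the use of the log of its cumulative distribution as the score, the plain average over phones, and the names `sigmoid_score`, `log_cdf_score`, and `word_confidence`.

```python
import math


def sigmoid_score(llr, alpha=1.0, beta=0.0):
    """Map a phone-level LLR into (0, 1) with a sigmoid (baseline-style).

    alpha and beta are hypothetical slope and offset parameters.
    """
    return 1.0 / (1.0 + math.exp(-alpha * (llr - beta)))


def log_cdf_score(llr, mean, std):
    """Hypothetical log-scale normalization (not the paper's function).

    Assumes the LLR of a correctly recognized phone is Gaussian with the
    given mean and std, and returns the log of the Gaussian CDF at llr.
    An LLR far below the mean gives a large negative score.
    """
    z = (llr - mean) / (std * math.sqrt(2.0))
    cdf = 0.5 * math.erfc(-z)
    # Floor the CDF to avoid log(0) for extremely low LLRs.
    return math.log(max(cdf, 1e-300))


def word_confidence(phone_llrs, phone_stats=None):
    """Average the normalized phone scores over the phones of one word.

    phone_stats is an optional list of (mean, std) pairs, one per phone,
    standing in for the word-dependent distributions. Without it, the
    sigmoid baseline is used.
    """
    if phone_stats is None:
        scores = [sigmoid_score(x) for x in phone_llrs]
    else:
        scores = [
            log_cdf_score(x, m, s) for x, (m, s) in zip(phone_llrs, phone_stats)
        ]
    return sum(scores) / len(scores)


# Made-up LLRs for a three-phone word: all phones typical vs. one poor phone.
good_word = [2.0, 2.0, 2.0]
bad_word = [2.0, -6.0, 2.0]
stats = [(2.0, 1.0)] * 3  # hypothetical per-phone (mean, std)

sigmoid_drop = word_confidence(good_word) - word_confidence(bad_word)
log_drop = word_confidence(good_word, stats) - word_confidence(bad_word, stats)
```

In this sketch a sigmoid score is bounded, so one poor phone can lower the word average by at most 1/N for an N-phone word, whereas the log-scale score is unbounded below and lets a single poorly matching phone dominate the word score. That is one way to read the abstract's remark that the function emphasizes a misrecognized phone.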
Publisher
IEICE-INST ELECTRONICS INFORMATION COMMUNICATIONS ENG
Issue Date
2008-11
Language
English
Article Type
Letter
Keywords

CONFIDENCE MEASURES; SPEECH RECOGNITION

Citation

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, v.E91D, no.11, pp.2746 - 2750

ISSN
0916-8532
DOI
10.1093/ietisy/e91-d.11.2746
URI
http://hdl.handle.net/10203/23099
Appears in Collection
EE-Journal Papers