This paper proposes word voiceprint models
and word-dependent thresholds, both built from distributions
of phone-level log-likelihood ratio and duration, to verify the
recognition results obtained from a speech recognition
system. Word voiceprint models carry word-dependent
information based on the distributions of phone-level
log-likelihood ratio and duration. Thus, we can obtain a
more reliable confidence score for a recognized word by
using its word voiceprint models, which better capture the
characteristics relevant to utterance verification for that
word. Many conditions affect the choice of
thresholds in an utterance verification system. In this paper,
we propose an algorithm that generates a threshold for each
word from the distributions of phone-level log-likelihood ratio.
Because the confidence measure obtained from phone-level
log-likelihood ratios is distributed differently for each word,
a different threshold must be applied to each recognized
word. The algorithm using word voiceprint models achieves
a 16.9% relative reduction in equal error rate
compared to the baseline system using simple phone
log-likelihood ratios. The word-dependent thresholding
method achieves a 14.6% relative reduction in equal error
rate compared to the baseline system using one global
threshold.
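
The idea of word-dependent thresholding can be illustrated with a minimal sketch. It is not the algorithm of this paper, whose details are not given in the abstract. The sketch assumes that the confidence of a word is the mean of its phone-level log-likelihood ratios (duration is ignored), that labelled confidence scores of correct and incorrect recognitions are available for each word, and that the threshold of a word is placed where its false acceptance and false rejection rates are closest. All function names and the `min_samples` parameter are hypothetical.

```python
from statistics import mean


def word_confidence(phone_llrs):
    """Assumed confidence measure: mean of the phone-level LLRs of one word."""
    return mean(phone_llrs)


def eer_threshold(correct_scores, incorrect_scores):
    """Return (threshold, error rate) at the point where the false rejection
    rate (FRR) and false acceptance rate (FAR) are closest to each other."""
    candidates = sorted(set(correct_scores) | set(incorrect_scores))
    best = None
    for t in candidates:
        # A word is accepted when its confidence is >= t.
        frr = sum(s < t for s in correct_scores) / len(correct_scores)
        far = sum(s >= t for s in incorrect_scores) / len(incorrect_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    return best[1], best[2]


def word_thresholds(training, global_threshold, min_samples=2):
    """Build one threshold per word from its own score distributions.

    `training` maps word -> (correct_scores, incorrect_scores). Words with
    too few samples in either class fall back to the global threshold
    (an assumption of this sketch).
    """
    thresholds = {}
    for word, (correct, incorrect) in training.items():
        if len(correct) < min_samples or len(incorrect) < min_samples:
            thresholds[word] = global_threshold
        else:
            thresholds[word], _ = eer_threshold(correct, incorrect)
    return thresholds


def verify(word, phone_llrs, thresholds, global_threshold):
    """Accept the recognized word if its confidence reaches its threshold."""
    return word_confidence(phone_llrs) >= thresholds.get(word, global_threshold)
```

In such a scheme, a word whose correct recognitions tend to score low receives a lower threshold than it would under one global threshold, which is the motivation stated above for adapting the threshold to each recognized word.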
Issue Date
2011-03-29
Keywords
Confidence Measure; Utterance Verification; Word voiceprint models; Word-dependent thresholds