Speech recognition technology is expected to have a great impact on
many user interface areas such as toys, mobile phones, PDAs, and home appliances.
These applications require robust speech recognition that is immune
to environmental and channel noise, but the dialogue pattern used in interacting
with the devices can be relatively simple, namely the isolated-word type.
The drawback of the small-vocabulary isolated-word recognizers generally
used in these applications is that, if the target vocabulary needs to be changed, the
acoustic models must be re-trained to maintain high performance. However, if phone
model-based speech recognition is used with reliable unseen model prediction,
the acoustic models do not need to be re-trained to achieve high performance. In
this paper, we propose three reliable methods for unseen model prediction in
flexible-vocabulary speech recognition. The first method gives optimal threshold
values for the stopping criteria in decision tree growing, and the second adds an
extra condition to question selection in order to overcome the overbalancing
phenomenon in the conventional method. The last proposes two-stage
decision trees, which yield more properly trained models in the first stage
and build more reliable unseen models in the second. Various vocabulary-independent
situations were examined in order to clearly show the effectiveness
of the proposed methods. In the experiments, the three proposed methods
reduced the average word error rate by 32.8%, 41.4%, and 44.1%, respectively,
compared to the conventional method. From the results, we can conclude that
the proposed methods are very effective for unseen model prediction in
vocabulary-independent speech recognition.
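
For orientation, the conventional technique that the three methods refine can be sketched as follows. This is a minimal illustration of generic phonetic decision-tree growing with two stopping thresholds, not the proposed methods themselves. It assumes one-dimensional Gaussian statistics per context-dependent state, questions about the left context only, and hypothetical names throughout (State, grow_tree, find_leaf, min_gain, min_occ, and the toy data). An unseen model is predicted by answering the questions for a context that never occurred in training and sharing the parameters of the leaf that is reached.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Toy statistics of one seen context-dependent state (hypothetical)."""
    left: str    # left-context phone
    occ: float   # occupancy count accumulated in training
    mean: float  # 1-D Gaussian mean
    var: float   # 1-D Gaussian variance


def pooled_loglik(states):
    """Log-likelihood of the pooled data under a single ML Gaussian."""
    n = sum(s.occ for s in states)
    if n <= 0:
        return 0.0
    mean = sum(s.occ * s.mean for s in states) / n
    second = sum(s.occ * (s.var + s.mean ** 2) for s in states) / n
    var = max(second - mean ** 2, 1e-6)  # floor to avoid log(0)
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def grow_tree(states, questions, min_gain, min_occ):
    """Split greedily; stop on the gain or occupancy threshold."""
    parent = pooled_loglik(states)
    best = None
    for name, phones in questions.items():
        yes = [s for s in states if s.left in phones]
        no = [s for s in states if s.left not in phones]
        if not yes or not no:
            continue  # the question does not separate anything
        if min(sum(s.occ for s in yes), sum(s.occ for s in no)) < min_occ:
            continue  # a child would have too little training data
        gain = pooled_loglik(yes) + pooled_loglik(no) - parent
        if best is None or gain > best[0]:
            best = (gain, name, yes, no)
    if best is None or best[0] < min_gain:
        return {"leaf": sorted(s.left for s in states)}
    _, name, yes, no = best
    return {
        "question": name,
        "yes": grow_tree(yes, questions, min_gain, min_occ),
        "no": grow_tree(no, questions, min_gain, min_occ),
    }


def find_leaf(tree, questions, left):
    """Route a (possibly unseen) left context to the leaf it shares."""
    while "leaf" not in tree:
        branch = "yes" if left in questions[tree["question"]] else "no"
        tree = tree[branch]
    return tree["leaf"]


# Toy data: "ng" and "k" are covered by the questions but never seen.
QUESTIONS = {"L_Nasal": {"m", "n", "ng"}, "L_Stop": {"p", "t", "k"}}
STATES = [
    State("m", 50.0, 0.0, 1.0),
    State("n", 40.0, 0.2, 1.0),
    State("p", 60.0, 5.0, 1.0),
    State("t", 30.0, 5.3, 1.0),
]
tree = grow_tree(STATES, QUESTIONS, min_gain=10.0, min_occ=20.0)
```

In this toy setting the unseen left context "ng" is routed to the leaf shared by the seen nasal contexts. Raising either threshold far enough stops growth at the root, which illustrates why the choice of threshold values affects how reliable the predicted unseen models are.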