Machine transliteration has received significant attention as
a supporting tool for machine translation and cross-language information
retrieval. During the last decade, four kinds of transliteration models
have been studied: grapheme-based, phoneme-based, hybrid, and
correspondence-based models. These models are classified
by the information sources used for transliteration or by the units
to be transliterated: source graphemes, source phonemes, both source
graphemes and source phonemes, and the correspondence between source
graphemes and phonemes, respectively. Although each transliteration
model has shown relatively good performance, any one model alone is limited
in handling complex transliteration behaviors. To address this
problem, we combined different transliteration models with a “generating
transliterations followed by their validation” strategy. The strategy
makes it possible to consider complex transliteration behaviors using
the strengths of each model and to improve transliteration performance
by validating transliterations. Our method uses both web-based and
transliteration-model-based validation. Experiments
showed that our method outperforms both the individual
transliteration models and previous work.
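
The generate-then-validate strategy can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: the two toy models, the candidate strings, the scores, and the equal weighting of web-based and model-based validation are all hypothetical placeholders chosen only to show the control flow.

```python
from typing import Callable, Dict, List

# A "model" here is any function mapping a source word to candidate
# transliterations. Real grapheme-, phoneme-, hybrid-, or
# correspondence-based models would be plugged in instead of these stubs.
Model = Callable[[str], List[str]]


def generate_candidates(word: str, models: List[Model]) -> List[str]:
    """Pool candidates from every model, dropping duplicates, keeping order."""
    pooled: Dict[str, None] = {}
    for model in models:
        for candidate in model(word):
            pooled.setdefault(candidate, None)
    return list(pooled)


def validate(
    candidates: List[str],
    web_score: Callable[[str], float],
    model_score: Callable[[str], float],
    weight: float = 0.5,  # assumed weighting, not taken from the paper
) -> List[str]:
    """Rank candidates by a weighted sum of the two validation scores."""
    def combined(candidate: str) -> float:
        return weight * web_score(candidate) + (1.0 - weight) * model_score(candidate)

    return sorted(candidates, key=combined, reverse=True)


# Hypothetical toy models and scores, for illustration only.
def toy_grapheme_model(word: str) -> List[str]:
    return ["cand_a", "cand_b"]


def toy_phoneme_model(word: str) -> List[str]:
    return ["cand_b", "cand_c"]


WEB_SCORES = {"cand_a": 0.1, "cand_b": 0.9, "cand_c": 0.4}
MODEL_SCORES = {"cand_a": 0.8, "cand_b": 0.6, "cand_c": 0.2}

candidates = generate_candidates("data", [toy_grapheme_model, toy_phoneme_model])
ranked = validate(candidates, WEB_SCORES.get, MODEL_SCORES.get)
print(ranked)
```

Pooling candidates before scoring lets a candidate proposed by only one model still be ranked first if validation supports it, which is the point of combining the models.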