Effective foreign word extraction for Korean information retrieval

Cited 10 time in webofscience Cited 11 time in scopus
  • Hit : 350
  • Download : 0
In Korean text, foreign words, which are mostly transliterations of English words, are frequently used. Foreign words are usually very important index terms in Korean information retrieval since most of them are technical terms or names. So accurate foreign word extraction is crucial for high performance of information retrieval. However, accurate foreign word extraction is not easy because it inevitably accompanies word segmentation and most of the foreign words are unknown. In this paper, we present an effective foreign word recognition and extraction method. In order to accurately extract foreign words, we developed an effective method of word segmentation that involves unknown foreign words. Our word segmentation method effectively utilizes both unknown word information acquired through the automatic dictionary compilation and foreign word recognition information. Our HMM-based foreign word recognition method does not require large labeled examples for the model training unlike the previously proposed method. (C) 2001 Elsevier Science Ltd. All rights reserved.
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
Issue Date
2002-01
Language
English
Article Type
Article
Citation

INFORMATION PROCESSING MANAGEMENT, v.38, no.1, pp.91 - 109

ISSN
0306-4573
DOI
10.1016/S0306-4573(00)00065-0
URI
http://hdl.handle.net/10203/85164
Appears in Collection
CS-Journal Papers(저널논문)
Files in This Item
There are no files associated with this item.
This item is cited by other documents in WoS
⊙ Detail Information in WoSⓡ Click to see webofscience_button
⊙ Cited 10 items in WoS Click to see citing articles in records_button

qr_code

  • mendeley

    citeulike


rss_1.0 rss_2.0 atom_1.0