Stochastic modeling of Korean language for Open-vocabulary handwritten Hangul recognition
개방 어휘 환경에서의 필기체 한글 인식을 위한 한국어의 통계적 모델링

A stochastic language model provides linguistic likelihoods of expressions, which can be used to resolve ambiguities in geometric evidence. This is especially beneficial in handwritten Hangul recognition, because the compositional structure of Hangul characters makes many of them highly similar in shape. For modeling the Korean language, morpheme-based models have been preferred because of its agglutinative characteristics. These models basically assume that the input texts are syntactically well-formed, but this assumption holds only in limited domains. They also require a morphologically analyzed corpus for training, which is expensive to obtain because the corpus must be processed by human experts. We present a novel language model that can be trained from a raw corpus. Without relying on linguistic knowledge, we learn the lexicon and its associated probabilities from raw texts using statistical measures. Experiments show that the proposed model effectively captures the variable-length regularities of the Korean language even though no linguistic knowledge was used explicitly during training. In recognition experiments, both the character recognition rate and the word recognition rate improve significantly when the proposed language model is employed.
Advisors
Kim, Jin-Hyung (김진형)
Description
Korea Advanced Institute of Science and Technology (KAIST): Computer Science major
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2005
Identifier
244972/325007  / 000995124
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Computer Science major, 2005.2, [vii, 64 p.]

Keywords

Stochastic language modeling; Handwritten Hangul recognition; Pattern recognition

URI
http://hdl.handle.net/10203/32889
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=244972&flag=dissertation
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
