Automatic generation of composite labels using part-of-speech tags for parsing Korean

We propose a binary phrase structure grammar format with composite labels. The grammar uses binary rules so that the dependency between the two sub-trees can be represented in the label of the tree. The label of a tree is composed of two attributes, one extracted from each sub-tree, so that it represents the compositional information of the tree. The composite label is generated from part-of-speech tags using an automatic labeling algorithm. Since the proposed rule description scheme is binary and uses only part-of-speech information, it can readily be used in dependency grammar and applied to other languages as well. In best-1 context-free cross validation on a tree-tagged corpus of size 31,080, the labeled precision is 79.30%, which outperforms phrase structure grammar and dependency grammar by 5% and 4%, respectively. This result shows that the proposed rule description scheme is effective for parsing Korean.
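The idea of a composite label built from two sub-tree attributes can be illustrated with a minimal sketch. The abstract does not specify the labeling algorithm, so everything below is an assumption made for illustration: the `Node` class, the `attribute` and `composite_label` functions, the placeholder tags `N`, `P`, `V`, and in particular the rule that an internal node passes up the attribute of its right child (a head-final guess, not the paper's stated method).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """A node in a binary tree: leaves carry a POS tag, internal nodes two children."""
    pos: Optional[str] = None          # POS tag, set only on leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def attribute(node: Node) -> str:
    """Hypothetical attribute extraction (assumption, not from the paper).

    A leaf contributes its POS tag; an internal node contributes the
    attribute of its right child.
    """
    if node.pos is not None:
        return node.pos
    return attribute(node.right)


def composite_label(node: Node) -> Tuple[str, str]:
    """Build the label of a binary tree from one attribute per sub-tree."""
    if node.pos is not None:
        raise ValueError("leaves carry a POS tag, not a composite label")
    return (attribute(node.left), attribute(node.right))


# Illustrative tree ((N P) V) with placeholder tags.
inner = Node(left=Node(pos="N"), right=Node(pos="P"))
root = Node(left=inner, right=Node(pos="V"))
print(composite_label(inner))  # ('N', 'P')
print(composite_label(root))   # ('P', 'V')
```

Because each label is derived only from the POS tags at the leaves, a sketch like this needs no hand-written category inventory, which is the property the abstract credits for portability to dependency grammar and other languages.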
Publisher
World Scientific
Issue Date
2003-09
Language
ENG
Citation
International Journal of Computer Processing of Oriental Languages, v.16, no.3, pp.197-218

ISSN
0219-4279
URI
http://hdl.handle.net/10203/3612
Appears in Collection
CS-Journal Papers(저널논문)
