Restricted Representation of Phrase Structure Grammar for Building a Tree Annotated Corpus of Korean

In this paper, we introduce a method for representing phrase structure grammars for building a large annotated corpus of Korean syntactic trees. Korean differs from English in word order and word composition. Our study shows that these differences are significant enough to require meaningful changes in the tree annotation scheme for Korean relative to the schemes for English. A tree annotation scheme defines the grammar formalism to be assumed, the categories to be used, and the rules for determining correct parses for unsettled issues in parse construction. Korean is partially free in word order, and essential components of a sentence such as subjects and objects can be omitted with greater freedom than in English. We propose a restricted representation of phrase structure grammar to handle these characteristics of Korean more efficiently. An extensive experiment shows that the proposed representation yields improvements in parsing time as well as grammar size. We also describe Teb, a software environment set up with the goal of building a tree annotated corpus of Korean containing more than one million units. © 1997, Cambridge University Press. All rights reserved.
Publisher
CAMBRIDGE UNIV PRESS
Issue Date
1997
Language
English
Citation

NATURAL LANGUAGE ENGINEERING, v.3, no.2, pp.215 - 230

ISSN
1351-3249
DOI
10.1017/S1351324997001782
URI
http://hdl.handle.net/10203/71766