Extracting partial parsing rules from tree-annotated corpus: Toward deterministic global parsing

Cited 0 time in webofscience Cited 0 time in scopus
  • Hit : 409
  • Download : 0
It is not always possible to find a global parse for an input sentence owing to problems such as errors of a sentence, incompleteness of lexicon and grammar. Partial parsing is an alternative approach to respond to these problems. Partial parsing techniques try to recover syntactic information efficiently and reliably by sacrificing completeness and depth of analysis. One of the difficulties in partial parsing is how the grammar might be automatically extracted. In this paper we present a method of automatically extracting partial parsing rules from a tree-annotated corpus using the decision tree method. Our goal is deterministic global parsing using partial parsing rules, in other words, to extract partial parsing rules with higher accuracy and broader expansion. First, we define a rule template that enables to learn a subtree for a given substring, so that the resultant rules can be more specific and stricter to apply. Second, rule candidates extracted from a training corpus are enriched with contextual and lexical information using the decision tree method and verified through cross-validation. Last, we underspecify non-deterministic rules by merging substructures with ambiguity in those rules. The learned grammar is similar to phrase structure grammar with contextual and lexical information, but allows building structures of depth one or more. Thanks to automatic learning, the partial parsing rules can be consistent and domain-independent. Partial parsing with this grammar processes an input sentence deterministically using longest-match heuristics, and recursively applies rules to an input sentence. The experiments showed that the partial parser using automatically extracted rules is not only accurate and efficient but also achieves reasonable coverage for Korean.
Publisher
IEICE-INST ELECTRONICS INFORMATION COMMUNICATIONS ENG
Issue Date
2005-06
Language
English
Article Type
Article
Citation

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, v.E88D, no.6, pp.1248 - 1255

ISSN
0916-8532
DOI
10.1093/ietisy/e88-d.6.1248
URI
http://hdl.handle.net/10203/92528
Appears in Collection
CS-Journal Papers(저널논문)
Files in This Item
There are no files associated with this item.

qr_code

  • mendeley

    citeulike


rss_1.0 rss_2.0 atom_1.0