DSpace at KOASAS: Patent document categorization based on semantic structural information


Authors: Kim, JH / Choi, Key-Sun

The number of patent documents is rising rapidly worldwide, creating the need for an automatic categorization system to replace time-consuming and labor-intensive manual categorization. Because accurate classification is crucial when searching for relevant existing patents in a given field, patent categorization is an important and useful task. Patent documents are structured documents whose characteristics distinguish them from general documents, and these traits should be taken into account in the categorization process. In this paper, we categorize Japanese patent documents automatically, focusing on their structure: patents are organized into claims, purposes, effects, embodiments of the invention, and so on. We propose a patent document categorization method based on the k-NN (k-Nearest Neighbour) approach. To retrieve similar documents from a training document set, specific components that denote so-called semantic elements, such as claim, purpose, and application field, are compared instead of the whole texts. Because those components are identified by various user-defined tags, all of the components are first clustered into several semantic elements. These semantically clustered structural components are the basic features for patent categorization. The method achieves a 74% improvement in categorization performance over a baseline system that does not use the structural information of the patent. © 2007 Published by Elsevier Ltd.
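The idea of comparing semantic elements rather than whole texts can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the similarity measure, the element weights, or the voting rule, so the bag-of-words cosine similarity, equal weights, and majority vote below are assumptions, and the element names, toy documents, and function names are hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical semantic elements; the abstract names claim, purpose, and application field.
ELEMENTS = ("claim", "purpose", "field")


def bag_of_words(text):
    """Term counts over lowercased whitespace tokens (an assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def element_similarity(doc_a, doc_b, weights=None):
    """Sum per-element similarities instead of comparing whole texts (equal weights assumed)."""
    weights = weights or {e: 1.0 for e in ELEMENTS}
    return sum(
        weights[e] * cosine(bag_of_words(doc_a.get(e, "")), bag_of_words(doc_b.get(e, "")))
        for e in ELEMENTS
    )


def knn_categorize(query, training, k=3):
    """Return the majority category among the k most similar training documents."""
    ranked = sorted(training, key=lambda item: element_similarity(query, item[0]), reverse=True)
    votes = Counter(category for _, category in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training set: (document split into semantic elements, category label).
training = [
    ({"claim": "a battery cell with lithium electrode",
      "purpose": "increase battery capacity", "field": "energy storage"}, "battery"),
    ({"claim": "a lithium battery pack with cooling",
      "purpose": "prevent battery overheating", "field": "energy storage"}, "battery"),
    ({"claim": "an engine piston with coated ring",
      "purpose": "reduce engine friction", "field": "combustion engines"}, "engine"),
    ({"claim": "an engine valve timing device",
      "purpose": "improve engine efficiency", "field": "combustion engines"}, "engine"),
]
query = {"claim": "a lithium battery electrode coating",
         "purpose": "increase battery life", "field": "energy storage"}

print(knn_categorize(query, training, k=3))
```

In this sketch each document is already split into semantic elements. In the paper's setting that split would come from clustering the user-defined tags first, a step the sketch does not attempt.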

Publisher: PERGAMON-ELSEVIER SCIENCE LTD

Issue Date: 2007-09

Language: English

Article Type: Article

Description: Received 1 September 2005; accepted 29 May 2006; Available online 26 March 2007

Citation: INFORMATION PROCESSING & MANAGEMENT, v.43, no.5, pp.1200 - 1215

ISSN: 0306-4573

DOI: 10.1016/j.ipm.2007.02.002

URI: http://hdl.handle.net/10203/3537

Appears in Collection: CS-Journal Papers


Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
