DSpace at KOASAS: An evaluation of passage-based text categorization

College of Engineering > School of Computing > CS-Journal Papers

An evaluation of passage-based text categorization

DC Field	Value	Language
dc.contributor.author	Kim, Jin Suk	ko
dc.contributor.author	Kim, Myoung Ho	ko
dc.date.accessioned	2007-11-19T01:37:11Z	-
dc.date.available	2007-11-19T01:37:11Z	-
dc.date.created	2012-02-06	-
dc.date.issued	2004-07	-
dc.identifier.citation	JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, v.23, no.1, pp.47 - 65	-
dc.identifier.issn	0925-9902	-
dc.identifier.uri	http://hdl.handle.net/10203/1983	-
dc.description.abstract	Research in text categorization has been confined to whole-document-level classification, probably due to the lack of full-text test collections. However, the full-length documents now available in large quantities have renewed interest in text classification. A document is usually written in an organized structure to present its main topic(s). This structure can be expressed as a sequence of subtopic text blocks, or passages. To reflect the subtopic structure of a document, we propose a new passage-level, or passage-based, text categorization model, which segments a test document into several passages, assigns categories to each passage, and merges the passage categories into document categories. Compared with traditional document-level categorization, this model requires two additional steps: passage splitting and category merging. Using four subsets of the Reuters text categorization test collection and a full-text test collection whose documents vary in size from tens to hundreds of kilobytes, we evaluate the proposed model, focusing on the effectiveness of various passage types and the importance of passage location in category merging. Our results show that simple windows are best for all test collections used in these experiments. We also found that passages contribute to the main topic(s) to different degrees, depending on their location in the test document.	-
dc.description.sponsorship	We would like to thank Wonkyun Joo for some helpful comments and fruitful discussions, Hwa-muk Yoon for providing raw data for the KISTI-Theses test collection, and Changmin Kim and Jieun Chong for supporting this work.	en
dc.language	English	-
dc.language.iso	en_US	en
dc.publisher	SPRINGER	-
dc.subject	RANKING	-
dc.title	An evaluation of passage-based text categorization	-
dc.type	Article	-
dc.identifier.wosid	000221745000003	-
dc.identifier.scopusid	2-s2.0-3042796461	-
dc.type.rims	ART	-
dc.citation.volume	23	-
dc.citation.issue	1	-
dc.citation.beginningpage	47	-
dc.citation.endingpage	65	-
dc.citation.publicationname	JOURNAL OF INTELLIGENT INFORMATION SYSTEMS	-
dc.identifier.doi	10.1023/B:JIIS.0000029670.53363.d0	-
dc.embargo.liftdate	9999-12-31	-
dc.embargo.terms	9999-12-31	-
dc.contributor.localauthor	Kim, Myoung Ho	-
dc.contributor.nonIdAuthor	Kim, Jin Suk	-
dc.type.journalArticle	Article	-
dc.subject.keywordAuthor	text categorization	-
dc.subject.keywordAuthor	passage	-
dc.subject.keywordAuthor	non-overlapping window	-
dc.subject.keywordAuthor	overlapping window	-
dc.subject.keywordAuthor	paragraph	-
dc.subject.keywordAuthor	bounded-paragraph	-
dc.subject.keywordAuthor	page	-
dc.subject.keywordAuthor	TextTile	-
dc.subject.keywordAuthor	passage weight function	-
dc.subject.keywordPlus	RANKING	-
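
The three steps named in the abstract (passage splitting, per-passage category assignment, and category merging) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the keyword-overlap scorer `score_passage`, the `KEYWORDS` table, the window size, the threshold, and the `head_heavy` weight function are all hypothetical stand-ins. Only the overall split, classify, and merge structure and the idea of location-dependent passage weights come from the record above.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

# Hypothetical category vocabulary; a real system would use a trained classifier.
KEYWORDS: Dict[str, set] = {
    "grain": {"wheat", "corn", "harvest"},
    "trade": {"export", "tariff", "import"},
}


def split_into_windows(words: List[str], size: int,
                       step: Optional[int] = None) -> List[List[str]]:
    """Split words into fixed-size windows; step < size gives overlapping windows."""
    step = step or size  # default: non-overlapping windows
    passages = []
    for start in range(0, len(words), step):
        passages.append(words[start:start + size])
        if start + size >= len(words):
            break  # the last window already reaches the end of the document
    return passages


def score_passage(passage: List[str]) -> Dict[str, float]:
    """Toy scorer (assumption): fraction of passage words matching each category."""
    return {cat: sum(w in vocab for w in passage) / len(passage)
            for cat, vocab in KEYWORDS.items()}


def uniform(index: int, total: int) -> float:
    """Every passage contributes equally."""
    return 1.0


def head_heavy(index: int, total: int) -> float:
    """Illustrative location-based weight: earlier passages count more."""
    return 1.0 / (1 + index)


def merge_categories(passage_scores: List[Dict[str, float]],
                     weight: Callable[[int, int], float],
                     threshold: float = 0.5) -> List[str]:
    """Weighted average of passage scores; keep categories at or above threshold."""
    totals: Dict[str, float] = defaultdict(float)
    weight_sum = 0.0
    for i, scores in enumerate(passage_scores):
        w = weight(i, len(passage_scores))
        weight_sum += w
        for cat, s in scores.items():
            totals[cat] += w * s
    if weight_sum == 0:
        return []
    return sorted(c for c, t in totals.items() if t / weight_sum >= threshold)


def categorize(text: str, size: int = 3,
               weight: Callable[[int, int], float] = uniform) -> List[str]:
    """Split the document, score each passage, then merge to document categories."""
    passages = split_into_windows(text.lower().split(), size)
    return merge_categories([score_passage(p) for p in passages], weight)
```

Calling `categorize("wheat corn harvest wheat export tariff")` splits the text into two three-word windows, scores each against the toy vocabulary, and merges the scores. Passing `weight=head_heavy` shifts the result toward whatever the opening passage discusses, which is one simple way to model location-dependent contribution.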

Appears in Collection: CS-Journal Papers

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.

KOASAS

KOASAS

Browse

An evaluation of passage-based text categorization

This item is cited by other documents in WoS

KOASAS

Communities & Collections