DSpace at KOASAS: Rule identification from Web pages by the XRML approach

Rule identification from Web pages by the XRML approach

Hit : 1383
Download : 409

DC Field	Value	Language
dc.contributor.author	Kang, J	ko
dc.contributor.author	Lee, Jae Kyu	ko
dc.date.accessioned	2008-04-30T09:49:33Z	-
dc.date.available	2008-04-30T09:49:33Z	-
dc.date.created	2012-02-06	-
dc.date.issued	2005-11	-
dc.identifier.citation	DECISION SUPPORT SYSTEMS, v.41, no.1, pp.205 - 227	-
dc.identifier.issn	0167-9236	-
dc.identifier.uri	http://hdl.handle.net/10203/4304	-
dc.description.abstract	In the world of Web pages, there are oceans of documents in natural language texts and tables. To extract rules from Web pages and maintain consistency between them, we have developed the framework of XRML (eXtensible Rule Markup Language). XRML allows the identification of rules on Web pages and generates the identified rules automatically. For this purpose, we have designed the Rule Identification Markup Language (RIML), which is similar to the formal Rule Structure Markup Language (RSML), both as parts of XRML. RIML 2.0 is designed to identify rules not only from texts, but also from tables on Web pages, and to transform them automatically into formal rules in RSML syntax. While designing RIML 2.0, we considered the features of sharing variables and values, omitted terms, and synonyms. We have conducted an experiment to evaluate the potential benefit of the XRML approach with real-world Web pages from Amazon.com, BarnesandNoble.com, and Powells.com. We found that 100.0% of the rules and 99.7% of the rule components could be identified and automatically generated if we do not count the statements for linkages, which generically do not exist on the Web pages. Since the linkage components occupy 11.2% of all components in the rule base, the overall limitation of automatic rule generation is 88.8%. In this setting, 88.5% of the overall rule components could be generated from the identified rules from the Web pages. The result provides solid proof that XRML can facilitate the extraction and maintenance of rules from Web pages while building expert systems in the Semantic Web environment. (c) 2005 Elsevier B.V. All rights reserved.	-
dc.language	English	-
dc.language.iso	en_US	en
dc.publisher	ELSEVIER SCIENCE BV	-
dc.subject	KNOWLEDGE ACQUISITION	-
dc.subject	INFORMATION EXTRACTION	-
dc.subject	NATURAL-LANGUAGE	-
dc.subject	MARKUP LANGUAGE	-
dc.subject	TEXT	-
dc.subject	ONTOLOGIES	-
dc.subject	DOCUMENTS	-
dc.subject	SUPPORT	-
dc.subject	SYSTEM	-
dc.title	Rule identification from Web pages by the XRML approach	-
dc.type	Article	-
dc.identifier.wosid	000232712000012	-
dc.identifier.scopusid	2-s2.0-25444513368	-
dc.type.rims	ART	-
dc.citation.volume	41	-
dc.citation.issue	1	-
dc.citation.beginningpage	205	-
dc.citation.endingpage	227	-
dc.citation.publicationname	DECISION SUPPORT SYSTEMS	-
dc.identifier.doi	10.1016/j.dss.2005.01.004	-
dc.embargo.liftdate	9999-12-31	-
dc.embargo.terms	9999-12-31	-
dc.contributor.localauthor	Lee, Jae Kyu	-
dc.contributor.nonIdAuthor	Kang, J	-
dc.type.journalArticle	Article	-
dc.subject.keywordAuthor	rule identification	-
dc.subject.keywordAuthor	rule acquisition	-
dc.subject.keywordAuthor	knowledge engineering	-
dc.subject.keywordAuthor	knowledge acquisition	-
dc.subject.keywordAuthor	XRML	-
dc.subject.keywordAuthor	RuleML	-
dc.subject.keywordAuthor	XML	-
dc.subject.keywordPlus	KNOWLEDGE ACQUISITION	-
dc.subject.keywordPlus	INFORMATION EXTRACTION	-
dc.subject.keywordPlus	NATURAL-LANGUAGE	-
dc.subject.keywordPlus	MARKUP LANGUAGE	-
dc.subject.keywordPlus	TEXT	-
dc.subject.keywordPlus	ONTOLOGIES	-
dc.subject.keywordPlus	DOCUMENTS	-
dc.subject.keywordPlus	SUPPORT	-
dc.subject.keywordPlus	SYSTEM	-

Appears in Collection: MT-Journal Papers(저널논문)

KOASAS

Knowledge Service Development Team, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. Tel. 82-42-350-4493. Email: koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
