A study on measurement of similarity for interlinking Chinese, Japanese and Korean resources (Korean title: 한중일 언어자원 연결을 위한 유사도 측정 연구)

Linked Open Data (LOD) is an international endeavor to interlink structured data on the Web and create a Web of Data on a global level. Linking data can be achieved by understanding the semantic relationships between data items and building explicit links between them. One serious challenge that hinders this worldwide initiative is multilinguality. The current LOD provides limited support for non-Western data, in particular for Asian data. In this study, we propose a novel method with which Chinese, Japanese, and Korean (CJK) resources can be better matched and connected. The three countries share Chinese characters, even though Japan and Korea have their own languages. Utilizing the Unihan database, which covers more than 45,000 characters commonly used in the three countries, we show that the proposed method outperforms the traditional method based on string matching in finding similar characters and words among the three countries. The results represent the first step towards overcoming the multilingual barrier in semantically interlinking LOD resources across the three countries.
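The abstract does not spell out the proposed method, so the following is only a minimal sketch of the general idea it contrasts: plain string matching versus matching that first maps character variants to a shared form. The names `VARIANT_TO_CANONICAL`, `normalize`, `string_similarity`, and `variant_aware_similarity` are hypothetical, and the two-entry table is a hand-written stand-in for variant data of the kind the Unihan database records (for example, its simplified/traditional variant fields). It is not the thesis's actual similarity measure.

```python
from difflib import SequenceMatcher

# Hypothetical, hand-written stand-in for Unihan variant data.
# Each key is a simplified or Japanese new-form character; each value is
# the traditional form chosen here (by assumption) as the canonical one.
VARIANT_TO_CANONICAL = {
    "学": "學",  # U+5B66 -> U+5B78
    "国": "國",  # U+56FD -> U+570B
}


def normalize(word: str) -> str:
    """Replace every character that has a known variant with its canonical form."""
    return "".join(VARIANT_TO_CANONICAL.get(ch, ch) for ch in word)


def string_similarity(a: str, b: str) -> float:
    """Baseline: similarity ratio on the raw code points (plain string matching)."""
    return SequenceMatcher(None, a, b).ratio()


def variant_aware_similarity(a: str, b: str) -> float:
    """Same ratio, computed after mapping variants to their canonical forms."""
    return string_similarity(normalize(a), normalize(b))


if __name__ == "__main__":
    traditional = "學校"  # form used in Korean hanja and traditional Chinese
    new_form = "学校"     # form used in Japanese and simplified Chinese
    print(string_similarity(traditional, new_form))
    print(variant_aware_similarity(traditional, new_form))
```

In this toy case the raw comparison sees two different first characters, while the variant-aware comparison treats them as the same character. A real system would load the variant relations from the Unihan data files rather than a hard-coded dictionary.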
Advisors
Yi, Mun-Yong (이문용)
Description
Korea Advanced Institute of Science and Technology (KAIST) : Department of Knowledge Service Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2014
Identifier
569588/325007  / 020123595
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Knowledge Service Engineering, 2014.2, [ vi, 76 p. ]

Keywords

LOD; Linked Open Data; Unihan database; CJK; CJK language resources; Language similarity measurement; Similarity distance measure; Multilinguality

URI
http://hdl.handle.net/10203/197094
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=569588&flag=dissertation
Appears in Collection
IE-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
