Tower of Babel: a crowdsourcing game building sentiment lexicons for resource-scarce languages = 바벨탑: 다국어 감정분석 지원을 위한 집단지성 게임 기반의 감정분석자원 생산 방법

With the growth of Web 2.0 and social media, the volume of data on the Web has been increasing continuously. According to Cisco statistics, Web traffic had already reached 5,000 PB per month worldwide as of 2012 and continues to grow. In the realm of social media, in 2012 Twitter produced more than 400 million tweets every day, and Facebook generated more than 2.5 billion items every day. These Web data are mainly text and are written in multiple languages. Facebook currently serves 70 languages, and 75 percent of its users live outside the U.S. More than 60 percent of tweets in 2012 were written in non-English languages, and the volume of multilingual messages is rapidly increasing.

Sentiment analysis is one of the key enabling technologies in the field of natural language processing. It has useful applications in various domains, such as brand management, public opinion surveys, and stock market prediction.

For English, numerous resources for sentiment analysis are publicly available, and sentiment analysis has gone beyond the boundary of research and started to be used commercially. However, we still lack resources to perform sentiment analysis in non-English languages, because most computational linguistics research focuses on English and building high-quality resources for sentiment analysis is costly. We term non-English languages resource-scarce languages in this work.

We propose Tower of Babel (ToB), a crowdsourcing game that builds sentiment lexicons for resource-scarce languages. ToB aims to lower the cost of building resources for sentiment analysis by crowdsourcing and gamifying the manual annotation process, which has been the best, yet costly and inefficient, practice for building such resources. ToB builds a sentiment lexicon rather than other types of sentiment resources because the sentiment lexicon is the most computationally convenient and generalizable. We conducted an experiment w...
Advisors
Moon, Sue-Bok (문수복)
Description
Korea Advanced Institute of Science and Technology (KAIST): Web Science and Engineering major
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2013
Identifier
566510/325007  / 020114505
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): Web Science and Engineering major, 2013.8, [ v, 25 p. ]

Keywords

sentiment analysis; lexicon; game; opinion mining; multilingual; collective intelligence; gamification; crowdsourcing

URI
http://hdl.handle.net/10203/197104
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=566510&flag=dissertation
Appears in Collection
WST-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
