DSpace at KOASAS: Domain adaptation in sentiment classification based on probabilistic models


Domain adaptation in sentiment classification based on probabilistic models (확률 모델에 기반한 의견 분류에서의 도메인 적응)


DC Field	Value	Language
dc.contributor.advisor	Lee, Soo-Young	-
dc.contributor.advisor	이수영	-
dc.contributor.author	Lee, Cheong-An	-
dc.contributor.author	이청안	-
dc.date.accessioned	2013-09-12T02:01:38Z	-
dc.date.available	2013-09-12T02:01:38Z	-
dc.date.issued	2013	-
dc.identifier.uri	http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=513315&flag=dissertation	-
dc.identifier.uri	http://hdl.handle.net/10203/180995	-
dc.description	Thesis (Master's) - Korea Advanced Institute of Science and Technology: Department of Electrical Engineering, 2013.2, [ v, 57 p. ]	-
dc.description.abstract	Sentiment classification is the task of determining the overall contextual polarity of a review document. Companies can use it to identify problems with their products or services from large volumes of data, and customers can use it to decide which products or services to consume. Sentiment classification poses two main difficulties. First, documents are usually represented with a bag-of-words model, and the dimensionality of such data is very large, so methods are needed to extract features or reduce the number of dimensions. Second, if the training data and the test data come from different domains, performance degrades severely, yet it is hard to obtain labeled data for every domain of interest. To extract features or reduce the dimensionality, we tried three methods: principal component analysis (PCA), conditional entropy (CE), and independent component analysis (ICA). PCA lets us reduce the dimension without any loss of information. By slightly changing how the probabilities are estimated, we obtain a more balanced estimate of CE, which gives robust recognition across different numbers of selected features. ICA makes the features independent, so it was expected to give better results when combined with CE; however, experiments suggest that ICA is not useful for CE. To address the domain difference, we propose a domain-adapting Boltzmann machine algorithm. The main difference between domains comes from the vocabulary used in each domain, so our approach generates target-domain words that do not appear in the source domain, and vice versa. In this thesis, we first applied this idea to a simple toy problem and then to a real-world problem. Our algorithm improved the classification accuracy.	eng
dc.language	eng	-
dc.publisher	한국과학기술원	-
dc.subject	sentiment classification	-
dc.subject	domain adaptation	-
dc.subject	Boltzmann machine	-
dc.subject	conditional entropy	-
dc.subject	의견 분류	-
dc.subject	도메인 적응	-
dc.subject	볼츠만 머신	-
dc.subject	조건부 엔트로피	-
dc.subject	독립 요소 분석	-
dc.subject	independent component analysis	-
dc.title	Domain adaptation in sentiment classification based on probabilistic models	-
dc.title.alternative	확률 모델에 기반한 의견 분류에서의 도메인 적응	-
dc.type	Thesis(Master)	-
dc.identifier.CNRN	513315/325007	-
dc.description.department	Korea Advanced Institute of Science and Technology: Department of Electrical Engineering	-
dc.identifier.uid	020113491	-
dc.contributor.localauthor	Lee, Soo-Young	-
dc.contributor.localauthor	이수영	-

Appears in Collection: EE-Theses_Master(석사논문)

Files in This Item: There are no files associated with this item.


Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
