Extended author-topic modeling for separating topics in problem and solution perspectives (문제 및 해결책 관점의 주제 분리를 위한 확장된 저자-주제 모델링)

DC Field | Value | Language
dc.contributor.advisor | Choi, Ho-Jin | -
dc.contributor.advisor | 최호진 | -
dc.contributor.author | Her, Yun | -
dc.contributor.author | 허윤 | -
dc.date.accessioned | 2013-09-12T01:50:40Z | -
dc.date.available | 2013-09-12T01:50:40Z | -
dc.date.issued | 2012 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=487448&flag=dissertation | -
dc.identifier.uri | http://hdl.handle.net/10203/180528 | -
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science, 2012.2, [ ii, 34 p. ] | -
dc.description.abstract | This thesis proposes the Extended Author-Topic Model (EATM), which extracts more specific author information from the problem and solution perspectives. EATM extends an existing approach, the Author-Topic Model (ATM), an unsupervised learning technique for extracting information about authors and topics from large text collections. In particular, the proposed model is designed to extract the topic distributions of authors in the problem and solution perspectives. Research papers commonly involve two subject matters: a problem and a solution. The problem is the key objective of a research work, and the solution is the set of techniques with which the authors solve that problem. EATM represents documents as if they were generated by a stochastic process. An author is represented by a probability distribution over topics and perspectives. A topic is represented as a probability distribution over words. A perspective is also represented as a probability distribution over words, but its stochastic process is controlled in the initial stage. The topic-word and author-topic distributions are learned from data in an unsupervised manner using a Markov chain Monte Carlo algorithm. To achieve the goal of our research, we address two technical challenges. First, the topic assignment boundary is changed from the sentence to the phrase. It depends on the dataset and the perspective. Second, a preprocessing step using assigned phrase extension is needed to obtain a richer set of initialized phrases. We apply the proposed model to text collections that include research papers from four major conferences in computer science. We discuss the interpretation of the results discovered by the model with specific topics and authors and give prominence to the results discovered by both perspectives. We show the different rankings of authors discovered in each perspective and illustrate reviewer recommendation as an application to emphasize differences between the author-topi... | eng
dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | -
dc.subject | 주제 모델 | -
dc.subject | 저자-주제 모델 | -
dc.subject | Topic model | -
dc.subject | author-topic model | -
dc.subject | unsupervised learning | -
dc.subject | 비감독 학습 | -
dc.title | Extended author-topic modeling for separating topics in problem and solution perspectives | -
dc.title.alternative | 문제 및 해결책 관점의 주제 분리를 위한 확장된 저자-주제 모델링 | -
dc.type | Thesis(Master) | -
dc.identifier.CNRN | 487448/325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science | -
dc.identifier.uid | 020094099 | -
dc.contributor.localauthor | Choi, Ho-Jin | -
dc.contributor.localauthor | 최호진 | -
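
Illustrative note on the abstract above: the abstract describes documents as generated by a stochastic process in which authors are distributions over topics and topics are distributions over words. The following is a minimal sketch of that idea for the basic author-topic process only. It is not the thesis's EATM: the abstract does not specify how perspectives enter the process, so perspectives are left out. The function names, vocabulary, and probability values are all made up for illustration, and the uniform choice of an author per word is an assumption taken from the commonly described author-topic model.

```python
import random


def sample_index(probs, rng):
    """Draw an index from a discrete distribution given as a list of probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(authors, author_topic, topic_word, vocab, n_words, rng):
    """Generate (author, topic, word) triples under a basic author-topic process.

    For each word position: pick one of the document's authors uniformly
    (an assumption), pick a topic from that author's topic distribution,
    then pick a word from that topic's word distribution.
    """
    tokens = []
    for _ in range(n_words):
        author = rng.choice(authors)
        topic = sample_index(author_topic[author], rng)
        word = vocab[sample_index(topic_word[topic], rng)]
        tokens.append((author, topic, word))
    return tokens


# Toy parameters, invented for this sketch (not taken from the thesis).
VOCAB = ["ranking", "retrieval", "sampling", "inference"]
AUTHOR_TOPIC = {"author_a": [0.9, 0.1], "author_b": [0.2, 0.8]}
TOPIC_WORD = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]

if __name__ == "__main__":
    rng = random.Random(0)
    doc = generate_document(
        ["author_a", "author_b"], AUTHOR_TOPIC, TOPIC_WORD, VOCAB, 8, rng
    )
    for author, topic, word in doc:
        print(author, topic, word)
```

Learning runs in the opposite direction: given only the words and the author lists, a Markov chain Monte Carlo procedure such as the one mentioned in the abstract estimates the author-topic and topic-word distributions that this sketch takes as given.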
Appears in Collection
CS-Theses_Master(석사논문)
