Co-burst based topical word extraction for text summarization and search

The rapid growth of electronic text calls for reliable automatic text analysis that supports human knowledge work. Two of the most promising applications are document summarization and topical search for desired documents. This research proposes modeling the writer’s cognitive process of communication as a sequence of topics. The set of topical words in a part of a document may indicate the schemas that compose the subject at that point in the discourse. If these topical word sets can be extracted, they can be used to improve both text summarization and document search. A co-burst based topical word extraction method is proposed to find these schematic terms. Burst analysis detects where a word is more active, or bursty, than in other parts of the document. A set of words that burst together (a co-burst) may represent the topic schemas of that part. The proposed method is implemented and applied first to single-document summarization, with a knowledge-based approach used as a complementary method. The results show that the new approach outperforms current state-of-the-art summarization methods, indicating that the schema terms found by co-burst detection contribute substantially. The approach is also applied to multi-document summarization, formulated as a multi-objective optimization problem with two objective functions: coverage and diversity. For the coverage objective, k-means clustering, knowledge-based, and co-burst based topical coverage functions are examined. The results are again better than those of conventional methods and show greater robustness in realistic situations where word sequence and order vary. Since a topic schema can be represented by a word set, extracting topics benefits text-based tasks such as searching for documents or sections of topical interest. This idea is demonstrated with an exemplar topical search system that visualizes the topics and lets the user interactively find relevant documents.
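The notion of a co-burst described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the burst detection algorithm used in the thesis, so the code below substitutes a simple stand-in: the document is cut into equal-length segments, and a word is treated as bursting in a segment when its local count is well above what a uniform spread over the document would predict. The function name `co_burst_sets` and the parameters `n_segments`, `ratio`, and `min_count` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def co_burst_sets(tokens, n_segments=4, ratio=2.0, min_count=2):
    """Return one set of co-bursting words per segment of the document.

    Assumption (not from the thesis): a word "bursts" in a segment when it
    occurs there at least `min_count` times and at least `ratio` times as
    often as a uniform spread over the whole document would predict.
    """
    if not tokens or n_segments < 1:
        return []
    # Ceiling division; short inputs may yield fewer than n_segments segments.
    seg_len = -(-len(tokens) // n_segments)
    segments = [tokens[i:i + seg_len] for i in range(0, len(tokens), seg_len)]
    total = Counter(tokens)  # document-wide counts
    result = []
    for seg in segments:
        local = Counter(seg)  # counts inside this segment
        share = len(seg) / len(tokens)  # fraction of the document covered
        bursting = {
            word
            for word, count in local.items()
            if count >= min_count and count >= ratio * total[word] * share
        }
        result.append(bursting)
    return result


# Toy document: the first half is about pets, the second half about tax law.
doc = "cat cat pet pet the of tax tax law law the of".split()
for index, words in enumerate(co_burst_sets(doc, n_segments=2)):
    print(index, sorted(words))
```

In this toy input, the function words "the" and "of" are spread evenly and never burst, while the content words concentrated in one half are grouped together. A real system would likely use a more principled burst model and proper tokenization; this sketch only shows how bursts that coincide in one part of a text yield a candidate topical word set.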
Advisors
Yoon, Wan Chul (윤완철)
Description
Korea Advanced Institute of Science and Technology (KAIST): Graduate School of Knowledge Service Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2020
Identifier
325007
Language
eng
Description

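Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Graduate School of Knowledge Service Engineering, 2020.2, [v, 91 p.]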

Keywords

burst analysis; automatic text summarization; evolutionary multi-objective optimization; interactive text search; topic schema

URI
http://hdl.handle.net/10203/295759
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=959238&flag=dissertation
Appears in Collection
KSE-Theses_Ph.D. (doctoral theses)
Files in This Item
There are no files associated with this item.
