Probabilistic Topic Modeling for Comparative Analysis of Document Collections

Cited 13 times in Web of Science; cited 7 times in Scopus
Probabilistic topic models, which can discover hidden patterns in documents, have been extensively studied. However, many real-world applications require more than learning from a single document collection: they demand a comprehensive understanding of the relationships among multiple document sets. To address this need, this article proposes a new model that can identify the common and discriminative aspects of multiple datasets. Specifically, the proposed method is a Bayesian approach that represents each document as a combination of common topics (shared across all document sets) and distinctive topics (distributions over words that are exclusive to a particular dataset). Through extensive experiments, we demonstrate the effectiveness of our method compared with state-of-the-art models. The proposed model can be useful for "comparative thinking" analysis in real-world document collections.
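The abstract's central idea, that each document mixes topics shared by all collections with topics exclusive to its own collection, can be illustrated with a toy generative sketch. This is not the article's model or inference procedure: the symmetric Dirichlet priors, the per-document Beta "switch" `p_common`, the function names `dirichlet` and `generate_corpus`, and all parameter values are assumptions introduced here purely for illustration.

```python
import random


def dirichlet(alpha, size, rng):
    """Draw a symmetric Dirichlet sample by normalizing Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(vocab, n_collections=2, n_common=2, n_distinct=1,
                    docs_per_collection=3, doc_len=8, seed=0):
    """Toy generative process (assumed, not taken from the article)."""
    rng = random.Random(seed)
    vocab_size = len(vocab)

    # Common topics: word distributions shared by every collection.
    common = [dirichlet(0.5, vocab_size, rng) for _ in range(n_common)]
    # Distinctive topics: word distributions owned by exactly one collection.
    distinct = [[dirichlet(0.5, vocab_size, rng) for _ in range(n_distinct)]
                for _ in range(n_collections)]

    corpus = []
    for c in range(n_collections):
        for _ in range(docs_per_collection):
            # Hypothetical switch: fraction of words drawn from common topics.
            p_common = rng.betavariate(1.0, 1.0)
            theta_common = dirichlet(1.0, n_common, rng)
            theta_distinct = dirichlet(1.0, n_distinct, rng)

            words = []
            for _ in range(doc_len):
                if rng.random() < p_common:
                    k = rng.choices(range(n_common), weights=theta_common)[0]
                    topic, label = common[k], ("common", k)
                else:
                    k = rng.choices(range(n_distinct), weights=theta_distinct)[0]
                    topic, label = distinct[c][k], ("distinct", c, k)
                word = rng.choices(vocab, weights=topic)[0]
                words.append((word, label))
            corpus.append((c, words))
    return corpus


if __name__ == "__main__":
    for collection_id, words in generate_corpus(["a", "b", "c", "d"]):
        print(collection_id, words)
```

Each generated word carries a label recording whether it came from a common topic or from a distinctive topic of the document's own collection, which makes the shared versus exclusive split visible in the toy output. Recovering such a split from real text would require posterior inference, which this sketch does not attempt.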
Publisher
ASSOC COMPUTING MACHINERY
Issue Date
2020-03
Language
English
Article Type
Article
Citation
ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, v.14, no.2, pp.24
ISSN
1556-4681
DOI
10.1145/3369873
URI
http://hdl.handle.net/10203/279465
Appears in Collection
AI-Journal Papers