Some effective techniques for naive Bayes text classification

Cited 293 times in Web of Science; cited 0 times in Scopus
While naive Bayes is quite effective in various data mining tasks, it performs disappointingly in automatic text classification. By observing how naive Bayes behaves on natural language text, we found a serious problem in the parameter estimation process, which causes poor results in the text classification domain. In this paper, we propose two empirical heuristics: per-document text normalization and a feature weighting method. While these are somewhat ad hoc methods, our proposed naive Bayes text classifier performs very well on the standard benchmark collections, competing with state-of-the-art text classifiers based on highly complex learning methods such as SVM.
Publisher
IEEE COMPUTER SOC
Issue Date
2006-11
Language
English
Article Type
Article
Keywords

CATEGORIZATION

Citation

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, v.18, pp.1457 - 1466

ISSN
1041-4347
URI
http://hdl.handle.net/10203/16860
Appears in Collection
CS-Journal Papers