An Approach to Spam Comment Detection through Domain-independent Features

Kim, Jong Myoung / Kim, Zae Myung / Kim, Kwangjo

Previous research on spam detection, especially email spam filtering, has mainly focused on learning a set of discriminative features that are often present in spam content. Such commercially oriented spam is now detected well; the real challenge lies in filtering vaguer spam that does not exhibit distinctive spam keywords. We investigate two ways of detecting such spam: 1) measuring the similarity between publisher posts and user comments, and 2) learning a single representative meta-feature such as the user name or ID. The first method relieves us of repeatedly learning a set of domain-dependent spam features, and the second enables us to detect potential spam users even before they take aggressive action. Before the language model comparison in the first method, we supplement background information, normalize the text, perform co-reference resolution, and compute word-to-word similarity in the hope of enriching the language models and improving classification accuracy. To evaluate the first method, we conduct experiments on detecting blog spam comments. For the second method, we apply an SVM to the ID space of email data collected by "Apache SpamAssassin".
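The first method's idea, comparing the language model of a comment with that of the post it is attached to, can be illustrated with a minimal sketch. The abstract does not state which model or divergence the authors use, so everything here is an assumption made for illustration: add-alpha smoothed unigram models, KL divergence as the comparison, and the hypothetical function names `tokenize`, `unigram_model`, `kl_divergence`, and `post_comment_divergence`. The enrichment steps named in the abstract (background information, co-reference resolution, word-to-word similarity) are not implemented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unigram_model(tokens, vocabulary, alpha=1.0):
    """Build an add-alpha smoothed unigram distribution over `vocabulary`.

    Smoothing is an assumption of this sketch; it keeps every probability
    non-zero so that the KL divergence below is always finite.
    """
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocabulary)
    return {word: (counts[word] + alpha) / total for word in vocabulary}


def kl_divergence(p, q):
    """Return KL(p || q) in nats for two distributions on the same support."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def post_comment_divergence(post, comment, alpha=1.0):
    """Score how far a comment's word distribution is from the post's.

    A larger value means the comment looks less like the post, which this
    sketch treats as a hint (not proof) that the comment may be spam.
    """
    post_tokens = tokenize(post)
    comment_tokens = tokenize(comment)
    vocabulary = set(post_tokens) | set(comment_tokens)
    if not vocabulary:
        return 0.0
    comment_model = unigram_model(comment_tokens, vocabulary, alpha)
    post_model = unigram_model(post_tokens, vocabulary, alpha)
    return kl_divergence(comment_model, post_model)
```

In use, a comment that shares vocabulary with the post should score lower than an unrelated one, and a threshold on the score (which would have to be tuned on labeled data) could then serve as a domain-independent spam signal.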

Publisher: Korean Institute of Information Scientists and Engineers (KIISE)

Issue Date: 2016-01-19

Language: English

Citation: 2016 International Conference on Big Data and Smart Computing (BigComp2016), pp. 273-276

DOI: 10.1109/BIGCOMP.2016.7425926

URI: http://hdl.handle.net/10203/214365

Appears in Collection: CS-Conference Papers(학술회의논문)

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
