AIRWEB
2005
Springer

Blocking Blog Spam with Language Model Disagreement

We present an approach for detecting link spam common in blog comments by comparing the language models used in the blog post, the comment, and pages linked by the comments. In contrast to other link spam filtering approaches, our method requires no training, no hard-coded rule sets, and no knowledge of complete-web connectivity. Preliminary experiments with identification of typical blog spam show promising results.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval - search engine spam; I.7.5 [Document Capture]: Document analysis - document classification, spam filtering; K.4.1 [Computers and Society]: Public Policy Issues - abuse and crime involving computers, privacy

General Terms
Algorithms, Languages, Legal Aspects

Keywords
Comment spam, language models, blogs
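
The idea of comparing language models can be illustrated with a minimal Python sketch. The abstract does not say how the models are built or compared, so everything here is an assumption made for illustration: unigram models, interpolation smoothing against the pooled text with a hypothetical weight `lam`, and KL divergence as the disagreement score. The function names (`tokenize`, `unigram_model`, `kl_divergence`, `disagreement`) are likewise hypothetical and not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def unigram_model(tokens, vocab, background, lam=0.9):
    """Smoothed unigram model over `vocab` (assumed scheme, not from the paper).

    Each probability mixes the maximum-likelihood estimate with a
    background estimate, so every word in `vocab` gets non-zero mass.
    """
    counts = Counter(tokens)
    total = len(tokens)
    model = {}
    for word in vocab:
        ml = counts[word] / total if total else 0.0
        model[word] = lam * ml + (1 - lam) * background[word]
    return model


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)


def disagreement(text_a, text_b, lam=0.9):
    """Score how strongly the language of text_a differs from text_b."""
    tokens_a, tokens_b = tokenize(text_a), tokenize(text_b)
    pooled = tokens_a + tokens_b
    vocab = set(pooled)
    pooled_counts = Counter(pooled)
    # Background model: relative frequencies in the pooled text.
    background = {w: pooled_counts[w] / len(pooled) for w in vocab}
    p = unigram_model(tokens_a, vocab, background, lam)
    q = unigram_model(tokens_b, vocab, background, lam)
    return kl_divergence(p, q)


post = "We discuss language models for information retrieval and smoothing methods"
on_topic = "Nice overview of smoothing methods for language models"
spam = "Cheap pills online casino poker buy now"

print(disagreement(post, on_topic))
print(disagreement(post, spam))
```

In this sketch a higher score means the two texts use more divergent vocabulary, so the unrelated comment scores higher than the on-topic one. The same `disagreement` function could be applied to the text of a page linked from a comment. How a score would be turned into a spam decision is left open here, since the abstract gives no threshold or decision rule.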
Gilad Mishne, David Carmel, Ronny Lempel
Added 26 Jun 2010
Updated 26 Jun 2010
Type Conference
Year 2005
Where AIRWEB
Authors Gilad Mishne, David Carmel, Ronny Lempel