AIRWEB
2008
Springer

Exploring linguistic features for web spam detection: a preliminary study

We study the usability of linguistic features in the Web spam classification task. The features were computed on two Web spam corpora, Webspam-Uk2006 and Webspam-Uk2007, and we make them publicly available to other researchers. Preliminary analysis suggests that certain linguistic features may be useful for spam detection when combined with features studied elsewhere.
Categories and Subject Descriptors: H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing--Linguistic processing; I.2.6 [Artificial Intelligence]: Learning
General Terms: Web spam
Keywords: Web spam detection, content features, linguistic features
Jakub Piskorski, Marcin Sydow, Dawid Weiss
Added: 12 Oct 2010
Updated: 12 Oct 2010
Type: Conference
Year: 2008
Where: AIRWEB
Authors: Jakub Piskorski, Marcin Sydow, Dawid Weiss