CEAS
2011
Springer

Spam detection using web page content: a new battleground

Traditional content-based e-mail spam filtering takes into account the content of e-mail messages and applies machine learning techniques to infer patterns that discriminate spam from ham. In particular, the use of content-based spam filtering has unleashed an unending arms race between spammers and filter developers, given the spammers’ ability to continuously change spam message content in ways that might circumvent the current filters. In this paper, we propose to expand the horizons of content-based filters by taking into consideration the content of the Web pages linked by e-mail messages. We describe a methodology for extracting pages linked by URLs in spam messages and we characterize the relationship between those pages and the messages. We then use a machine learning technique (a lazy associative classifier) to extract classification rules from the Web pages that are relevant to spam detection. We demonstrate that the use of information from linked pages can nicely complemen...
Added 13 Dec 2011
Updated 13 Dec 2011
Type Journal
Year 2011
Where CEAS
Authors Marco Túlio Ribeiro, Pedro Henrique Calais Guerra, Leonardo Vilela, Adriano Veloso, Dorgival Guedes, Wagner Meira Jr., Marcelo H. P. C. Chaves, Klaus Steding-Jessen, Cristine Hoepers