On Using SVM and Kolmogorov Complexity for Spam Filtering

As a side effect of e-marketing strategies, the number of spam e-mails is rocketing, and so are the time and cost needed to deal with them. Spam filtering is one of the most difficult text categorization tasks, a sad consequence of spammers' dynamic efforts to escape filtering. In this paper, we investigate the use of Kolmogorov complexity theory as a backbone for spam filtering, avoiding the burden of text analysis and of keyword and blacklist updates. Exploiting the fact that we can estimate a message's information content through compression techniques, we represent an e-mail as a multidimensional real vector and then implement a support vector machine classifier to classify new incoming e-mails. Our first results exhibit interesting accuracy rates and emphasize the relevance of our idea.
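The abstract does not say how the e-mail vector is built, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that compressed length (here via zlib) stands in for Kolmogorov complexity, and that each vector component is the normalized compression distance between the message and a reference text. The names `compressed_size`, `ncd`, and `to_vector`, and the choice of reference texts, are hypothetical.

```python
import zlib
from typing import List, Sequence


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed input.

    Assumption: used as a computable stand-in for Kolmogorov
    complexity, which itself is not computable.
    """
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Lower values suggest the two inputs share more information.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def to_vector(message: str, references: Sequence[str]) -> List[float]:
    """Map a message to one real-valued feature per reference text.

    Assumption: the reference texts (e.g. known spam and known
    legitimate mail) are chosen by the user; the abstract does not
    specify how the vector components are defined.
    """
    encoded = message.encode("utf-8")
    return [ncd(encoded, ref.encode("utf-8")) for ref in references]
```

Vectors produced this way could then be passed to any SVM implementation for training and classification, for instance scikit-learn's `sklearn.svm.SVC`; that step is left out here to keep the sketch free of third-party dependencies.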
Sihem Belabbes, Gilles Richard
Added 02 Oct 2010
Updated 02 Oct 2010
Type Conference
Year 2008