Sciweavers

WWW
2008
ACM

Detecting image spam using visual features and near duplicate detection

14 years 9 months ago
Detecting image spam using visual features and near duplicate detection
Email spam is a much studied topic, but even though current email spam detecting software has been gaining a competitive edge against text based email spam, new advances in spam generation have posed a new challenge: image-based spam. Image based spam is email which includes embedded images containing the spam messages, but in binary format. In this paper, we study the characteristics of image spam to propose two solutions for detecting image-based spam, while drawing a comparison with the existing techniques. The first solution, which uses the visual features for classification, offers an accuracy of about 98%, i.e. an improvement of at least 6% compared to existing solutions. SVMs (Support Vector Machines) are used to train classifiers using judiciously decided color, texture and shape features. The second solution offers a novel approach for near duplication detection in images. It involves clustering of image GMMs (Gaussian Mixture Models) based on the Agglomerative Information Bo...
Bhaskar Mehta, Saurabh Nangia, Manish Gupta 0002,
Added 21 Nov 2009
Updated 21 Nov 2009
Type Conference
Year 2008
Where WWW
Authors Bhaskar Mehta, Saurabh Nangia, Manish Gupta 0002, Wolfgang Nejdl
Comments (0)