Combining Global and Personal Anti-Spam Filtering

Many of the first successful applications of statistical learning to anti-spam filtering were personalized classifiers that were trained on an individual user’s spam and ham e-mail. Proponents of personalized filters argue that statistical text learning is effective because it can identify the unique aspects of each individual’s e-mail. On the other hand, a single classifier learned for a large population of users can leverage the data provided by each individual user across hundreds or even thousands of users. This paper investigates the trade-off between globally- and personally-trained anti-spam classifiers. We find that globally-trained text classification easily outperforms personally-trained classification under realistic settings. This result does not imply that personalization is not valuable. We show that the two techniques can be combined to produce a modest improvement in overall performance.
Richard Segal
Added 07 Jun 2010
Updated 07 Jun 2010
Type Conference
Year 2007
Where CEAS
Authors Richard Segal