Online Discriminative Spam Filter Training

We describe a very simple technique for discriminatively training a spam filter. Our results on the TREC Enron spam corpus would have been the best by the Ham at .1% measure, and second best by the 1-ROCA measure. On the Mr. X corpus, our results were a close second best by the 1-ROCA measure, and third best by the Ham at .1% measure. We use a very simple feature extractor (all words in the subject and headers). Our learning algorithm is also very simple: gradient descent on a logistic regression model.
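The two ingredients named in the abstract, a bag-of-words feature extractor and a logistic regression model trained by gradient descent, can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the learning rate, the class and function names (`extract_features`, `OnlineLogisticFilter`), and the toy messages are all assumptions made for illustration.

```python
import math
import re
from collections import defaultdict


def extract_features(subject, headers):
    """Return the set of lowercase word tokens in the subject and headers.

    The tokenization rule (runs of letters and digits) is an assumption.
    """
    text = subject + " " + headers
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class OnlineLogisticFilter:
    """Logistic regression over binary word features, updated one message at a time."""

    def __init__(self, learning_rate=0.1):  # learning rate is a hypothetical choice
        self.learning_rate = learning_rate
        self.weights = defaultdict(float)
        self.bias = 0.0

    def predict(self, features):
        """Return the estimated probability that the message is spam."""
        score = self.bias + sum(self.weights.get(f, 0.0) for f in features)
        score = max(-30.0, min(30.0, score))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, features, is_spam):
        """Take one gradient step on the log-likelihood of a single message."""
        error = (1.0 if is_spam else 0.0) - self.predict(features)
        step = self.learning_rate * error
        self.bias += step
        for f in features:
            self.weights[f] += step


# Toy stream of (subject, headers, is_spam) triples; the messages are invented.
training_stream = [
    ("cheap meds now", "from: promo@example.com", True),
    ("meeting agenda", "from: alice@example.org", False),
    ("win cash prize now", "from: deals@example.com", True),
    ("quarterly report draft", "from: bob@example.org", False),
]

spam_filter = OnlineLogisticFilter()
# Several passes are used only because the toy stream is tiny; an online
# filter would normally see each message once, as it arrives.
for _ in range(20):
    for subject, headers, is_spam in training_stream:
        spam_filter.update(extract_features(subject, headers), is_spam)
```

Calling `spam_filter.predict(extract_features(subject, headers))` on a new message returns a spam probability, and calling `update` once the true label is known keeps the model training as mail arrives.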
Joshua Goodman, Wen-tau Yih
Added 20 Aug 2010
Updated 20 Aug 2010
Type Conference
Year 2006
Where CEAS