Analysis of traffic logs of email received by a large UK ISP shows considerable disparity between the proportions of spam received by addresses with different first characters. Th...
Noise analysis of higher-order translinear filters cannot be established through straightforward extension of analysis techniques for first-order TL filters, due to the presence n...
Michiel H. L. Kouwenhoven, J. Mulder, Wouter A. Se...
One of the biggest challenges in building effective anti-spam solutions is designing systems to defend against the ever-evolving bag of tricks spammers use to defeat them. Because ...
This paper describes our opinion retrieval system for the TREC 2008 blog track. We focused on five different aspects of the system. The first module focuses on extracting the blog...
Co-training is a semi-supervised technique that allows classifiers to learn with fewer labelled documents by taking advantage of the more abundant unclassified documents. However, ...