Essentially all data mining algorithms assume that the data-generating process is independent of the data miner's activities. However, in many domains, including spam detectio...
Nilesh N. Dalvi, Pedro Domingos, Mausam, Sumit K. ...
We approached this line of inquiry by questioning the conventional wisdom that audit logs are too large to be analyzed and must be reduced and filtered before the data can be anal...
In this paper, we describe the JAM system, a distributed, scalable and portable agent-based data mining system that employs a general approach to scaling data mining applications ...
Salvatore J. Stolfo, Andreas L. Prodromidis, Shell...
We address the problem of filtering image spam, a rapidly spreading kind of spam in which the text message is embedded into attached images to defeat spam filtering techniques b...
Battista Biggio, Giorgio Fumera, Ignazio Pillai, F...
We introduce a technique for detecting anomalous patterns in a categorical feature (one that takes values from a finite alphabet). It differs from most anomaly detection methods u...