Essentially all data mining algorithms assume that the datagenerating process is independent of the data miner's activities. However, in many domains, including spam detectio...
Nilesh N. Dalvi, Pedro Domingos, Mausam, Sumit K. ...
Background: Microarray experiments measure changes in the expression of thousands of genes. The resulting lists of genes with changes in expression are then searched for biologica...
We present a general approach to model selection and regularization that exploits unlabeled data to adaptively control hypothesis complexity in supervised learning tasks. The idea ...
Abstract--Large-scale attacks like Distributed Denial-ofService (DDoS) attacks still pose unpredictable threats to the Internet infrastructure and Internet-based business. Thus, ma...
We consider the task of performing anomaly detection in highly noisy multivariate data. In many applications involving real-valued time-series data, such as physical sensor data a...