Social media are becoming increasingly popular and have attracted considerable attention from spammers. Using a sample of more than ninety thousand known spam Web sites, we found ...
This paper describes a new approach to the analysis of Poisson point processes in time (1D) or space (2D), based on the minimum description length (MDL) framework. Speci...
We present a unifying framework for information-theoretic feature selection, bringing almost two decades of research on heuristic filter criteria under a single theoretical inter...
Gavin Brown, Adam Pocock, Ming-Jie Zhao, Mikel Luj...
Cluster methods have been successfully applied in gene expression data analysis to address tumor classification. By grouping tissue samples into homogeneous subsets, more systema...
We consider problems on data sets where each data point has uncertainty described by an individual probability distribution. We develop several frameworks and algorithms for calcul...