Unsupervised clustering can be significantly improved using supervision in the form of pairwise constraints, i.e., pairs of instances labeled as belonging to same or different clu...
Classifiers are traditionally learned using sets of positive and negative training examples. However, often a classifier is required, but for training only an incomplete set of pos...
Supervised learning from multiple labeling sources is an increasingly important problem in machine learning and data mining. This paper develops a probabilistic approach to this p...
We briefly survey several privacy compromises in published datasets, some historical and some on paper. An inspection of these suggests that the problem lies with the nature of the...
Abstract-- Many applications are driven by evolving data -patterns in web traffic, program execution traces, network event logs, etc., are often non-stationary. Building prediction...
Shixi Chen, Haixun Wang, Shuigeng Zhou, Philip S. ...