Sciweavers

Share
KDD
2004
ACM

A probabilistic framework for semi-supervised clustering

9 years 3 months ago
A probabilistic framework for semi-supervised clustering
Unsupervised clustering can be significantly improved using supervision in the form of pairwise constraints, i.e., pairs of instances labeled as belonging to same or different clusters. In recent years, a number of algorithms have been proposed for enhancing clustering quality by employing such supervision. Such methods use the constraints to either modify the objective function, or to learn the distance measure. We propose a probabilistic model for semisupervised clustering based on Hidden Markov Random Fields (HMRFs) that provides a principled framework for incorporating supervision into prototype-based clustering. The model generalizes a previous approach that combines constraints and Euclidean distance learning, and allows the use of a broad range of clustering distortion measures, including Bregman divergences (e.g., Euclidean distance and I-divergence) and directional similarity measures (e.g., cosine similarity). We present an algorithm that performs partitional semi-supervised...
Sugato Basu, Mikhail Bilenko, Raymond J. Mooney
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2004
Where KDD
Authors Sugato Basu, Mikhail Bilenko, Raymond J. Mooney
Comments (0)
books