COLT
2008
Springer

Finding Metric Structure in Information Theoretic Clustering

We study the problem of clustering discrete probability distributions with respect to the Kullback-Leibler (KL) divergence. This problem arises naturally in many applications. Our goal is to pick k distributions as "representatives" such that the average or maximum KL-divergence between an input distribution and the closest representative distribution is minimized. Unfortunately, no polynomial-time algorithms with worst-case performance guarantees are known for either of these problems. The analogous problems for l1, l2 and l2^2 (i.e., k-center, k-median and k-means) have been extensively studied and efficient algorithms with good approximation guarantees are known. However, these algorithms rely crucially on the (geo-)metric properties of these metrics and do not apply to KL-divergence. In this paper, our contribution is to find a "relaxed" metric structure for KL-divergence. In doing so, we provide the first polynomial-time algorithm for clustering using KL-diverge...
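To make the two objectives in the abstract concrete, here is a minimal Python sketch that only evaluates the average and maximum cost for a given set of representatives. It is not the paper's algorithm. The function names (kl_divergence, clustering_costs) are hypothetical, and the direction KL(input || representative) is an assumption, since the abstract does not state which argument comes first.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # the term 0 * log(0 / q) is taken to be 0
        if qi == 0:
            return math.inf  # p has mass where q has none
        total += pi * math.log(pi / qi)
    return total

def clustering_costs(inputs, representatives):
    """Return (average, maximum) KL cost to the closest representative.

    Assumption: cost is KL(input || representative); the abstract
    does not fix the direction.
    """
    nearest = [
        min(kl_divergence(p, c) for c in representatives)
        for p in inputs
    ]
    return sum(nearest) / len(nearest), max(nearest)

# Two input distributions, one candidate representative.
inputs = [[0.5, 0.5], [1.0, 0.0]]
representatives = [[0.5, 0.5]]
avg_cost, max_cost = clustering_costs(inputs, representatives)
```

Swapping the arguments changes the result: KL([1.0, 0.0] || [0.5, 0.5]) is log 2, while KL([0.5, 0.5] || [1.0, 0.0]) is infinite. This asymmetry is one reason the metric-based k-median and k-center algorithms mentioned in the abstract do not carry over directly.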
Kamalika Chaudhuri, Andrew McGregor
Added 18 Oct 2010
Updated 18 Oct 2010
Type Conference
Year 2008
Where COLT