Clustering with Bregman Divergences

A wide variety of distortion functions, such as squared Euclidean distance, Mahalanobis distance, Itakura-Saito distance, and relative entropy, have been used for clustering. In this paper, we propose and analyze parametric hard and soft clustering algorithms based on a large class of distortion functions known as Bregman divergences. The proposed algorithms unify centroid-based parametric clustering approaches, such as classical k-means, the Linde-Buzo-Gray (LBG) algorithm, and information-theoretic clustering, which arise from special choices of the Bregman divergence. The algorithms maintain the simplicity and scalability of the classical k-means algorithm while generalizing the method to a large class of clustering loss functions. This is achieved by first posing the hard clustering problem in terms of minimizing the loss in Bregman information, a quantity motivated by rate distortion theory, and then deriving an iterative algorithm that monotonically decreases this loss. In addition, ...
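The hard clustering scheme the abstract describes can be sketched as a k-means-style alternation: assign each point to the representative with the smallest Bregman divergence, then update each representative as the arithmetic mean of its assigned points (a key result of the paper is that the mean is the optimal representative for every Bregman divergence). The following is a minimal illustrative sketch, not the authors' implementation; the function names and the NumPy-based structure are assumptions.

```python
import numpy as np

def bregman_hard_clustering(X, k, divergence, n_iter=100, seed=0):
    """Illustrative Bregman hard clustering (k-means-style alternation).

    divergence(X, c) returns the Bregman divergence from each row of X
    to the single representative c. The update step is the arithmetic
    mean regardless of which Bregman divergence is chosen.
    """
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: nearest representative under the divergence.
        D = np.stack([divergence(X, c) for c in centers], axis=1)
        labels = D.argmin(axis=1)
        # Update step: arithmetic mean of each cluster; an empty
        # cluster keeps its previous representative.
        new_centers = np.stack([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Squared Euclidean distance recovers classical k-means.
def sq_euclidean(X, c):
    return ((X - c) ** 2).sum(axis=1)

# Generalized KL divergence (for nonnegative data) yields an
# information-theoretic clustering variant.
def gen_kl(X, c, eps=1e-12):
    return (X * np.log((X + eps) / (c + eps)) - X + c).sum(axis=1)
```

Swapping `sq_euclidean` for `gen_kl` (or an Itakura-Saito divergence) changes only the assignment step; the mean-based update and the overall loop are unchanged, which is what gives the unified algorithm its k-means-like simplicity.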
Type Conference
Year 2004
Where SDM
Authors Arindam Banerjee, Srujana Merugu, Inderjit S. Dhillon, Joydeep Ghosh