Sciweavers

ICDM
2010
IEEE

Learning a Bi-Stochastic Data Similarity Matrix

13 years 2 months ago
Learning a Bi-Stochastic Data Similarity Matrix
An idealized clustering algorithm seeks to learn a cluster-adjacency matrix such that, if two data points belong to the same cluster, the corresponding entry would be 1; otherwise the entry would be 0. This integer (1/0) constraint makes it difficult to find the optimal solution. We propose a relaxation on the cluster-adjacency matrix, by deriving a bi-stochastic matrix from a data similarity (e.g., kernel) matrix according to the Bregman divergence. Our general method is named the Bregmanian Bi-Stochastication (BBS) algorithm. We focus on two popular choices of the Bregman divergence: the Euclidian distance and the KL divergence. Interestingly, the BBS algorithm using the KL divergence is equivalent to the Sinkhorn-Knopp (SK) algorithm for deriving bi-stochastic matrices. We show that the BBS algorithm using the Euclidian distance is closely related to the relaxed k-means clustering and can often produce noticeably superior clustering results than the SK algorithm (and other algorithm...
Fei Wang, Ping Li, Arnd Christian König
Added 12 Feb 2011
Updated 12 Feb 2011
Type Journal
Year 2010
Where ICDM
Authors Fei Wang, Ping Li, Arnd Christian König
Comments (0)