A new approach to data driven clustering

We consider the problem of clustering in its most basic form, where only a local metric on the data space is given. No parametric statistical model is assumed, and the number of clusters is learned from the data. We introduce, analyze and demonstrate a novel approach to clustering in which data points are viewed as nodes of a graph, and pairwise similarities are used to derive a transition probability matrix P for a Markov random walk between them. The algorithm automatically reveals structure at increasing scales by varying the number of steps taken by this random walk. Points are represented as rows of P^t, which are the t-step distributions of the walk starting at that point; these distributions are then clustered using a KL-minimizing iterative algorithm. Both the number of clusters and the number of steps that 'best reveal' it are found by optimizing spectral properties of P.
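The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Gaussian similarity kernel, the bandwidth `sigma`, the farthest-point initialisation, the hard cluster assignments, and the eigengap rule used to pick the number of clusters and steps are all assumptions made here, since the abstract only says that similarities define P, that rows of P^t are clustered by a KL-minimizing iteration, and that spectral properties of P are optimized. All function names are hypothetical.

```python
import numpy as np

def transition_matrix(X, sigma=1.0):
    """Row-stochastic P from pairwise similarities (Gaussian kernel is an assumption)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    return W / W.sum(axis=1, keepdims=True)

def kl_to_centers(P, Q, eps=1e-12):
    """KL[i, j] = KL(P[i] || Q[j]) for row distributions P (n x d) and Q (k x d)."""
    log_p = np.log(P + eps)
    log_q = np.log(Q + eps)
    return np.sum(P * log_p, axis=1)[:, None] - P @ log_q.T

def kl_cluster(Pt, k, n_iter=50):
    """Hard-assignment KL clustering of the rows of P^t (a k-means-like sketch)."""
    # Deterministic farthest-point initialisation (an assumption, for reproducibility).
    centers = Pt[[0]]
    for _ in range(k - 1):
        dist = kl_to_centers(Pt, centers).min(axis=1)
        centers = np.vstack([centers, Pt[np.argmax(dist)]])
    for _ in range(n_iter):
        labels = kl_to_centers(Pt, centers).argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = Pt[labels == j]
            if len(members):
                # The mean distribution minimizes the summed KL(p_i || q) over q.
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def choose_k_and_t(P, t_values, k_max):
    """Pick (k, t) maximizing the gap lambda_k^t - lambda_{k+1}^t (assumed criterion).

    k starts at 2 because the k = 1 gap tends to 1 as t grows for a connected graph.
    """
    lam = np.sort(np.linalg.eigvals(P).real)[::-1]
    best = (-np.inf, None, None)
    for t in t_values:
        for k in range(2, k_max + 1):
            gap = lam[k - 1] ** t - lam[k] ** t
            if gap > best[0]:
                best = (gap, k, t)
    return best[1], best[2]
```

A typical call, under the same assumptions, would compute `P = transition_matrix(X)`, pick `k, t = choose_k_and_t(P, [1, 2, 4, 8, 16, 32], 5)`, and then run `kl_cluster(np.linalg.matrix_power(P, t), k)` to obtain one label per data point.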
Arik Azran, Zoubin Ghahramani
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 2006
Where ICML