Robust information-theoretic clustering

14 years 3 months ago
Robust information-theoretic clustering
How do we find a natural clustering of a real world point set, which contains an unknown number of clusters with different shapes, and which may be contaminated by noise? Most clustering algorithms were designed with certain assumptions (Gaussianity), they often require the user to give input parameters, and they are sensitive to noise. In this paper, we propose a robust framework for determining a natural clustering of a given data set, based on the minimum description length (MDL) principle. The proposed framework, Robust Information-theoretic Clustering (RIC), is orthogonal to any known clustering algorithm: given a preliminary clustering, RIC purifies these clusters from noise, and adjusts the clusterings such that it simultaneously determines the most natural amount and shape (subspace) of the clusters. Our RIC method can be combined with any clustering technique ranging from K-means and K-medoids to advanced methods such as spectral clustering. In fact, RIC is even able to purif...
Christian Böhm, Christos Faloutsos, Claudia P
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2006
Where KDD
Authors Christian Böhm, Christos Faloutsos, Claudia Plant, Jia-Yu Pan
Comments (0)