Robust Data Clustering

We address the problem of robust clustering by combining data partitions (forming a clustering ensemble) produced by multiple clusterings. We formulate robust clustering under an information-theoretic framework: mutual information is the underlying concept used to define quantitative measures of agreement, or consistency, between data partitions. Robustness is assessed by the variance of the cluster membership, estimated by bootstrapping. We propose and analyze a voting mechanism on pairwise associations of patterns for combining data partitions. We show that the proposed technique attempts to optimize the mutual-information-based criteria, although optimality is not ensured in all situations. This evidence accumulation method is demonstrated using the well-known K-means algorithm to produce the clustering ensembles. Experimental results show the ability of the technique to identify clusters with arbitrary shapes and sizes.
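The voting idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a plain Lloyd-style K-means, a co-association matrix holding the fraction of partitions in which two patterns share a cluster, a majority threshold of 0.5 followed by connected components to extract the combined partition, and a normalized mutual information of the form 2I(A;B)/(H(A)+H(B)) as the agreement measure. The function names, the threshold, and the normalization are all illustrative choices that the abstract does not specify.

```python
import math
import random
from collections import Counter


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd-style K-means (illustrative); returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def co_association(partitions):
    """Fraction of partitions in which each pair of patterns shares a cluster."""
    n, m = len(partitions[0]), len(partitions)
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1  # one vote for the pair (i, j)
    return [[c / m for c in row] for row in counts]


def consensus(votes, threshold=0.5):
    """Assumed combination rule: link pairs voted together in more than
    `threshold` of the partitions, then return connected components."""
    n = len(votes)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and votes[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def normalized_mutual_information(a, b):
    """2 * I(a; b) / (H(a) + H(b)); one common normalization (an assumption)."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(c / n * math.log(c * n / (pa[x] * pb[y]))
             for (x, y), c in pab.items())
    ha = -sum(c / n * math.log(c / n) for c in pa.values())
    hb = -sum(c / n * math.log(c / n) for c in pb.values())
    return 2 * mi / (ha + hb) if ha + hb > 0 else 1.0


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data: two well-separated Gaussian blobs (made up for the demo).
    points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(15)]
    points += [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(15)]
    # Ensemble of K-means runs with varying k and random initializations.
    ensemble = [kmeans(points, rng.randint(2, 4), rng) for _ in range(20)]
    combined = consensus(co_association(ensemble))
    agreement = [normalized_mutual_information(combined, p) for p in ensemble]
    print("combined clusters:", len(set(combined)))
    print("mean agreement with ensemble:", sum(agreement) / len(agreement))
```

Thresholding the vote matrix and taking connected components is only one way to read a partition off the pairwise votes. Because points are linked through chains of strongly associated neighbours, it can follow elongated or irregular clusters that a single K-means run would split.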
Ana L. N. Fred, Anil K. Jain
Added 12 Oct 2009
Updated 12 Oct 2009
Type Conference
Year 2003
Where CVPR
Authors Ana L. N. Fred, Anil K. Jain