
PKDD
2000
Springer

Fast Hierarchical Clustering Based on Compressed Data and OPTICS

One way to scale up clustering algorithms is to squash the data with an intelligent compression technique and cluster only the compressed data records. Such compressed data records can be produced, for example, by the BIRCH algorithm. Typically they consist of sufficient statistics of the form (N, X, X²), where N is the number of points, X is the (vector) sum, and X² is the square sum of the points. They can be used directly to speed up k-means-type clustering algorithms, but it is not obvious how to use them in a hierarchical clustering algorithm. Applying a hierarchical clustering algorithm to, for example, the centers of the compressed subclusters produces a very weak result. The reason is that hierarchical clustering algorithms are based on the distances between data points, and the interpretation of the result relies heavily on a correct graphical representation of these distances. In this paper, we introduce a method by which the sufficient statistics (N, X, X²) of subclusters can be util...
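
To make the sufficient statistics (N, X, X²) concrete, the following is a minimal Python sketch, not the authors' implementation. The class name `SufficientStats` and its methods are hypothetical names chosen for illustration. It assumes Euclidean data and shows only the generic properties of such statistics: they are additive under merging, and the centroid and the root-mean-square radius of a subcluster can be recovered from them without revisiting the original points.

```python
from dataclasses import dataclass
import math
from typing import Sequence, Tuple


@dataclass(frozen=True)
class SufficientStats:
    """Hypothetical container for the triple (N, X, X^2) of one subcluster."""
    n: int                          # N: number of points
    linear_sum: Tuple[float, ...]   # X: per-dimension sum of the points
    square_sum: float               # X^2: sum of squared norms of the points

    @staticmethod
    def from_points(points: Sequence[Sequence[float]]) -> "SufficientStats":
        if not points:
            raise ValueError("need at least one point")
        dim = len(points[0])
        linear = tuple(float(sum(p[i] for p in points)) for i in range(dim))
        square = float(sum(x * x for p in points for x in p))
        return SufficientStats(len(points), linear, square)

    def merge(self, other: "SufficientStats") -> "SufficientStats":
        # The statistics are additive, so merging never touches raw points.
        linear = tuple(a + b for a, b in zip(self.linear_sum, other.linear_sum))
        return SufficientStats(self.n + other.n, linear,
                               self.square_sum + other.square_sum)

    def centroid(self) -> Tuple[float, ...]:
        return tuple(x / self.n for x in self.linear_sum)

    def radius(self) -> float:
        # Root-mean-square distance of the points to the centroid:
        # sqrt(X^2 / N - ||X / N||^2), clamped at 0 against rounding error.
        center_norm_sq = sum(c * c for c in self.centroid())
        return math.sqrt(max(self.square_sum / self.n - center_norm_sq, 0.0))


a = SufficientStats.from_points([(0.0, 0.0), (2.0, 0.0)])
b = SufficientStats.from_points([(1.0, 3.0)])
merged = a.merge(b)
```

Here `merged` carries the same triple that `SufficientStats.from_points` would compute from all three points at once, which is why k-means-type algorithms can work on the compressed records directly. The sketch also shows what is lost: the triple describes a subcluster's center and extent, but not the pairwise distances between its individual points.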
Added 25 Aug 2010
Updated 25 Aug 2010
Type Conference
Year 2000
Where PKDD
Authors Markus M. Breunig, Hans-Peter Kriegel, Jörg Sander