Scalable Density-Based Distributed Clustering

Clustering has become an increasingly important task in analysing huge amounts of data. Traditional applications require that all data be located at the site where it is analysed. Nowadays, large amounts of heterogeneous, complex data reside on different, independently working computers that are connected to each other via local or wide area networks. In this paper, we propose a scalable density-based distributed clustering algorithm that allows a user-defined trade-off between clustering quality and the number of objects transmitted from the different local sites to a global server site. Our approach consists of the following steps: First, we order all objects located at a local site according to a quality criterion reflecting their suitability to serve as local representatives. Then we send the best of these representatives to a server site, where they are clustered with a slightly enhanced density-based clustering algorithm. This approach is very efficient, because the lo...
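The two steps named in the abstract (rank local objects, then cluster the transmitted representatives on a server) can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the quality criterion or the "slightly enhanced" server-side clustering, so the sketch assumes neighborhood size as a stand-in quality score and uses plain DBSCAN on the server. All function names and the parameters `eps`, `k`, and `min_pts` are hypothetical.

```python
from math import dist


def neighbors(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def select_representatives(points, eps, k):
    """Rank local objects by a stand-in quality score and keep the best k.

    Assumption: quality = size of the eps-neighborhood. The paper's actual
    criterion is not given in the abstract.
    """
    ranked = sorted(range(len(points)),
                    key=lambda i: len(neighbors(points, i, eps)),
                    reverse=True)
    return [points[i] for i in ranked[:k]]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN (not the enhanced variant); -1 marks noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(points, i, eps)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(points, j, eps)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels


# Two hypothetical local sites, each with one dense group and one outlier.
site_a = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
site_b = [(3.0, 3.0), (3.1, 3.0), (3.0, 3.1), (3.1, 3.1), (9.0, 9.0)]

# Each site transmits only its k best representatives to the server.
reps = (select_representatives(site_a, eps=0.5, k=3)
        + select_representatives(site_b, eps=0.5, k=3))
labels = dbscan(reps, eps=0.5, min_pts=2)
```

In this sketch `k` plays the role of the user-defined trade-off: a larger `k` transmits more objects per site and gives the server more information to cluster, while a smaller `k` reduces network traffic.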
Eshref Januzaj, Hans-Peter Kriegel, Martin Pfeifle
Added 02 Jul 2010
Updated 02 Jul 2010
Type Conference
Year 2004
Where PKDD