Sciweavers

SODA
2008
ACM

Clustering for metric and non-metric distance measures

We study a generalization of the k-median problem with respect to an arbitrary dissimilarity measure D. Given a finite set P, our goal is to find a set C of size k such that the sum of errors $D(P, C) = \sum_{p \in P} \min_{c \in C} D(p, c)$ is minimized. The main result in this paper can be stated as follows: There exists an $O(n 2^{(k/\epsilon)^{O(1)}})$ time $(1+\epsilon)$-approximation algorithm for the k-median problem with respect to D, if the 1-median problem can be approximated within a factor of $(1+\epsilon)$ by taking a random sample of constant size and solving the 1-median problem on the sample exactly. Using this characterization, we obtain the first linear time $(1+\epsilon)$-approximation algorithms for the k-median problem in an arbitrary metric space with bounded doubling dimension, for the Kullback-Leibler divergence (relative entropy), for Mahalanobis distances, and for some special cases of Bregman divergences. Moreover, we obtain previously known results for the Euclidean k-median problem and the Euclidean k-means problem in a ...
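As a rough illustration of the objective and of the sampling condition described in the abstract, the following minimal Python sketch evaluates the sum of errors D(P, C) and estimates a 1-median from a small random sample. It is not the algorithm from the paper. It assumes the squared Euclidean distance (the k-means case), for which the exact 1-median of a sample is its centroid. All function names (`sq_euclidean`, `cost`, `centroid`, `sampled_one_median`) and the sample size are hypothetical choices made for this example.

```python
import random


def sq_euclidean(p, c):
    """Squared Euclidean distance; an assumed stand-in for the measure D."""
    return sum((a - b) ** 2 for a, b in zip(p, c))


def cost(P, C, D=sq_euclidean):
    """Sum of errors D(P, C): each point pays the distance to its nearest center."""
    return sum(min(D(p, c) for c in C) for p in P)


def centroid(S):
    """Coordinate-wise mean; the exact 1-median under squared Euclidean distance."""
    n = len(S)
    return tuple(sum(coords) / n for coords in zip(*S))


def sampled_one_median(P, sample_size, rng):
    """Draw a uniform sample (with replacement) and solve 1-median on it exactly."""
    S = [rng.choice(P) for _ in range(sample_size)]
    return centroid(S)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: 1000 points in the unit square.
    P = [(rng.random(), rng.random()) for _ in range(1000)]
    exact = centroid(P)
    approx = sampled_one_median(P, sample_size=50, rng=rng)
    print("cost with exact 1-median:  ", cost(P, [exact]))
    print("cost with sampled 1-median:", cost(P, [approx]))
```

Comparing the two printed costs gives a feel for how close a constant-size sample can come to the optimal 1-median cost; the guarantee stated in the abstract concerns exactly this ratio.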
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2008
Where SODA
Authors Marcel R. Ackermann, Johannes Blömer, Christian Sohler