Sciweavers

SODA
2008
ACM

Clustering for metric and non-metric distance measures

We study a generalization of the k-median problem with respect to an arbitrary dissimilarity measure D. Given a finite set P, our goal is to find a set C of size k such that the sum of errors $D(P, C) = \sum_{p \in P} \min_{c \in C} D(p, c)$ is minimized. The main result in this paper can be stated as follows: There exists an $O(n 2^{(k/\epsilon)^{O(1)}})$ time $(1 + \epsilon)$-approximation algorithm for the k-median problem with respect to D, if the 1-median problem can be approximated within a factor of $(1 + \epsilon)$ by taking a random sample of constant size and solving the 1-median problem on the sample exactly. Using this characterization, we obtain the first linear-time $(1 + \epsilon)$-approximation algorithms for the k-median problem in an arbitrary metric space with bounded doubling dimension, for the Kullback-Leibler divergence (relative entropy), for Mahalanobis distances, and for some special cases of Bregman divergences. Moreover, we obtain previously known results for the Euclidean k-median problem and the Euclidean k-means problem in a ...
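The objective and the sampling condition in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm. The function names (`cost`, `sampled_one_median`, `sq_euclidean`) and the choice of squared Euclidean distance are assumptions made for illustration. Squared Euclidean distance is used because the exact 1-median of a sample under it is simply the sample mean.

```python
import random

def sq_euclidean(p, c):
    """Squared Euclidean distance, used here as an example dissimilarity D."""
    return sum((a - b) ** 2 for a, b in zip(p, c))

def cost(P, C, D=sq_euclidean):
    """Sum of errors D(P, C): each point pays the dissimilarity to its closest center."""
    return sum(min(D(p, c) for c in C) for p in P)

def sampled_one_median(P, m, rng):
    """Approximate the 1-median of P from a uniform sample of size m.

    Assumption: D is squared Euclidean, so the optimal center of the
    sample is its coordinate-wise mean. Other measures need their own
    exact 1-median solver on the sample.
    """
    S = [rng.choice(P) for _ in range(m)]  # uniform sample with replacement
    dim = len(P[0])
    return tuple(sum(s[i] for s in S) / len(S) for i in range(dim))

if __name__ == "__main__":
    rng = random.Random(0)
    P = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]
    c = sampled_one_median(P, 20, rng)
    print("sampled center:", c, "cost:", cost(P, [c]))
```

The sample size `m` here is an arbitrary illustrative value; the abstract only states that a constant sample size suffices for the measures it covers.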
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2008
Where SODA
Authors Marcel R. Ackermann, Johannes Blömer, Christian Sohler