Sciweavers

KAIS
2000

An Index Structure for Data Mining and Clustering

Abstract. In this paper we present an index structure, called MetricMap, that takes a set of objects and a distance metric and then maps those objects to a k-dimensional space in such a way that the distances among objects are approximately preserved. The index structure is a useful tool for clustering and visualization in data-intensive applications, because it replaces expensive distance calculations with sum-of-squares calculations. This can make clustering in large databases with expensive distance metrics practical. We compare the index structure with another data mining index structure, FastMap, recently proposed by Faloutsos and Lin, according to two criteria: relative error and clustering accuracy. For relative error, we show that (i) FastMap gives a lower relative error than MetricMap for Euclidean distances, (ii) MetricMap gives a lower relative error than FastMap for non-Euclidean distances (i.e., general distance metrics), and (iii) combining the two reduces the error yet furt...
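To make the idea of a distance-preserving mapping concrete, here is a minimal Python sketch of a FastMap-style projection, the baseline the abstract compares against. It is not the MetricMap algorithm, whose details the abstract does not give. The function names (`fastmap`, `euclidean`, `mean_relative_error`) are illustrative, and the relative-error formula used here (mean of |embedded - original| / original over all pairs) is an assumption, not necessarily the paper's definition.

```python
import math
import random


def euclidean(p, q):
    """Sum-of-squares distance between two coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def fastmap(objects, dist, k):
    """Map objects to k-dimensional points with a FastMap-style projection."""
    n = len(objects)
    coords = [[0.0] * k for _ in range(n)]

    def residual_sq(i, j, axis):
        # Squared distance left after removing the first `axis` coordinates.
        d2 = dist(objects[i], objects[j]) ** 2
        for t in range(axis):
            d2 -= (coords[i][t] - coords[j][t]) ** 2
        return max(d2, 0.0)  # clamp: non-Euclidean metrics can go negative

    for axis in range(k):
        # Heuristic pivot choice: start at object 0, jump to the farthest twice.
        a = 0
        b = max(range(n), key=lambda j: residual_sq(a, j, axis))
        a = max(range(n), key=lambda j: residual_sq(b, j, axis))
        dab2 = residual_sq(a, b, axis)
        if dab2 == 0.0:
            break  # no residual distance left; remaining coordinates stay 0
        dab = math.sqrt(dab2)
        for i in range(n):
            # Cosine-law projection of object i onto the line through the pivots.
            coords[i][axis] = (
                residual_sq(a, i, axis) + dab2 - residual_sq(b, i, axis)
            ) / (2 * dab)
    return coords


def mean_relative_error(objects, dist, coords):
    """Assumed error measure: mean of |embedded - original| / original."""
    errors = []
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            d = dist(objects[i], objects[j])
            if d > 0:
                errors.append(abs(euclidean(coords[i], coords[j]) - d) / d)
    return sum(errors) / len(errors)


# Toy data: random 2-D points under the Euclidean metric.
random.seed(0)
points = [(random.random(), random.random()) for _ in range(30)]
coords = fastmap(points, euclidean, 2)
err = mean_relative_error(points, euclidean, coords)
print(err)
```

Once the coordinates are computed, any later distance query needs only `euclidean` on the mapped points rather than the original, possibly expensive, metric. For Euclidean input with k equal to the data's dimension, this projection should reproduce the original distances up to rounding; for general metrics the clamp in `residual_sq` is one common workaround and introduces error.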
Added 19 Dec 2010
Updated 19 Dec 2010
Type Journal
Year 2000
Where KAIS
Authors Xiong Wang, Jason Tsong-Li Wang, King-Ip Lin, Dennis Shasha, Bruce A. Shapiro, Kaizhong Zhang