Sciweavers

The Anchors Hierarchy: Using the Triangle Inequality to Survive High Dimensional Data

This paper is about the use of metric data structures in high-dimensional or non-Euclidean space to permit cached sufficient statistics accelerations of learning algorithms. It has recently been shown that for less than about 10 dimensions, decorating kd-trees with additional "cached sufficient statistics" such as first and second moments and contingency tables can provide satisfying acceleration for a very wide range of statistical learning tasks such as kernel regression, locally weighted regression, k-means clustering, mixture modeling and Bayes Net learning. In this paper, we begin by defining the anchors hierarchy, a fast data structure and algorithm for localizing data based only on a triangle-inequality-obeying distance metric. We show how this, in its own right, gives a fast and effective clustering of data. But more importantly we show how it can produce a well-balanced structure similar to a Ball-Tree (Omohundro 1990) or a kind of metric tree (Uhlmann 1991) in a way that is n...
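As a rough illustration of why a triangle-inequality-obeying metric helps, the sketch below finds the nearest of a few reference points ("anchors") to a query while skipping distance evaluations that cannot change the answer. It is not taken from the paper: the function names `pairwise_distances` and `nearest_anchor`, the Euclidean metric, and the sample points are all assumptions made for the example. The only fact it relies on is that if d(a, b) >= 2 * d(x, a), then d(x, b) >= d(x, a), so b cannot be closer to x than a.

```python
import math

def pairwise_distances(anchors, dist=math.dist):
    """Precompute the anchor-to-anchor distance table (hypothetical helper)."""
    return [[dist(a, b) for b in anchors] for a in anchors]

def nearest_anchor(x, anchors, anchor_dists, dist=math.dist):
    """Return (index, distance, evaluations) for the anchor nearest to x.

    Uses the triangle inequality to skip anchors that provably cannot
    beat the current best. Illustrative only, not the paper's algorithm.
    """
    best = 0
    best_d = dist(x, anchors[0])
    evaluations = 1
    for j in range(1, len(anchors)):
        # d(x, a_j) >= d(a_best, a_j) - d(x, a_best), so if
        # d(a_best, a_j) >= 2 * d(x, a_best), anchor j cannot be closer.
        if anchor_dists[best][j] >= 2.0 * best_d:
            continue
        d = dist(x, anchors[j])
        evaluations += 1
        if d < best_d:
            best, best_d = j, d
    return best, best_d, evaluations

# Example usage with made-up 2-D points.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
table = pairwise_distances(anchors)
index, distance, evaluations = nearest_anchor((1.0, 1.0), anchors, table)
print(index, distance, evaluations)
```

The pruning test needs only distances, never coordinates, so the same idea carries over to any metric that obeys the triangle inequality; the Euclidean default here is just a convenient stand-in.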
Andrew W. Moore
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 2000
Where UAI
Authors Andrew W. Moore