High-Dimensional Similarity Joins


Many emerging data mining applications require a similarity join between points in a high-dimensional domain. We present a new algorithm that utilizes a new index structure, called the ε-kdB tree, for fast spatial similarity joins on high-dimensional points. This index structure reduces the number of neighboring leaf nodes that are considered for the join test, as well as the traversal cost of finding appropriate branches in the internal nodes. The storage cost for internal nodes is independent of the number of dimensions. Hence the proposed index structure scales to high-dimensional data. Empirical evaluation, using synthetic and real-life datasets, shows that similarity join using the ε-kdB tree is from 2 times to an order of magnitude faster than the R+ tree, with the performance gap increasing with the number of dimensions.
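
To make the notion of a similarity join concrete, the following is a minimal Python sketch, not the authors' algorithm or index structure. It assumes the join means "report every pair of points whose Euclidean distance is at most a threshold eps", and it illustrates only the general idea of limiting which neighboring partitions are tested: points are bucketed into eps-wide stripes along a single dimension, so each stripe is compared only with itself and the next stripe. The function names `brute_force_join` and `stripe_join`, the choice of dimension 0, and the Euclidean metric are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from itertools import combinations


def brute_force_join(points, eps):
    """Reference join: all index pairs (i, j), i < j, within distance eps."""
    return {
        (i, j)
        for i, j in combinations(range(len(points)), 2)
        if math.dist(points[i], points[j]) <= eps
    }


def stripe_join(points, eps):
    """Self-join using eps-wide stripes along dimension 0 (illustrative only)."""
    # Assumption: stripes are cut on dimension 0; any dimension would work.
    stripes = defaultdict(list)
    for idx, p in enumerate(points):
        stripes[math.floor(p[0] / eps)].append(idx)

    result = set()
    for key, members in stripes.items():
        # Candidate pairs inside the same stripe.
        for i, j in combinations(members, 2):
            if math.dist(points[i], points[j]) <= eps:
                result.add((min(i, j), max(i, j)))
        # Candidate pairs between this stripe and the next one. Stripes that
        # are farther apart differ by more than eps in dimension 0, so they
        # cannot contain matching pairs and are never compared.
        for i in members:
            for j in stripes.get(key + 1, ()):
                if math.dist(points[i], points[j]) <= eps:
                    result.add((min(i, j), max(i, j)))
    return result
```

Calling `stripe_join(points, eps)` on a list of equal-length coordinate tuples returns the same set of index pairs as `brute_force_join`, while skipping every pair of points whose stripes are not adjacent. A single-dimension partition like this is only a toy; how much work it saves depends on how the data spreads along the chosen dimension relative to eps.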

Kyuseok Shim, Ramakrishnan Srikant, Rakesh Agrawal


Database | Fast Spatial Similarity | ICDE 1997 | Internal Nodes | Similarity Join |


Post Info

Added	01 Nov 2009
Updated	01 Nov 2009
Type	Conference
Year	1997
Where	ICDE
Authors	Kyuseok Shim, Ramakrishnan Srikant, Rakesh Agrawal
