
DEXA
2003
Springer

Supporting KDD Applications by the k-Nearest Neighbor Join

Abstract. The similarity join has become an important database primitive for supporting similarity search and data mining. A similarity join combines two sets of complex objects such that the result contains all pairs of similar objects. Two types of similarity join are well known: the distance range join, in which the user defines a distance threshold for the join, and the closest point query or k-distance join, which retrieves the k most similar pairs. In this paper, we propose an important third similarity join operation, called the k-nearest neighbor join, which combines each point of one point set with its k nearest neighbors in the other set. We discover that many standard algorithms of Knowledge Discovery in Databases (KDD), such as k-means and k-medoid clustering, nearest neighbor classification, data cleansing, and postprocessing of sampling-based data mining, can be implemented on top of the k-nn join operation to achieve performance improvements without affecting the quality of the res...
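The three join types named in the abstract can be contrasted with a small sketch. This is a brute-force illustration only, not the algorithm proposed in the paper: it assumes points are tuples of numbers compared by Euclidean distance, and the function names `range_join`, `k_distance_join` and `knn_join` are hypothetical names chosen for this example.

```python
import heapq
import math


def range_join(r, s, eps):
    """Distance range join: all pairs (p, q) with dist(p, q) <= eps."""
    return [(p, q) for p in r for q in s if math.dist(p, q) <= eps]


def k_distance_join(r, s, k):
    """k-distance join: the k closest pairs over the whole cross product."""
    pairs = ((p, q) for p in r for q in s)
    return heapq.nsmallest(k, pairs, key=lambda pq: math.dist(pq[0], pq[1]))


def knn_join(r, s, k):
    """k-nearest neighbor join: each p in r paired with its k nearest q in s.

    Brute force, O(|r| * |s|) distance computations; an assumption made
    for clarity, not the index-based processing a database would use.
    """
    result = []
    for p in r:
        neighbors = heapq.nsmallest(k, s, key=lambda q: math.dist(p, q))
        result.extend((p, q) for q in neighbors)
    return result


if __name__ == "__main__":
    r = [(0, 0), (10, 10)]
    s = [(1, 0), (0, 2), (9, 10), (20, 20)]
    print(range_join(r, s, 1.5))
    print(k_distance_join(r, s, 3))
    print(knn_join(r, s, 2))
```

The result sizes show the difference: the k-distance join returns k pairs in total, whereas the k-nn join returns k pairs for every point of the first set. Under the same assumptions, the assignment step of k-means can be read as `knn_join(points, centroids, 1)`, since each point is paired with its single nearest centroid.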
Christian Böhm, Florian Krebs
Type Conference
Year 2003
Where DEXA
Authors Christian Böhm, Florian Krebs