Sciweavers

5 search results - page 1 / 1
SIGMOD
2001
ACM
Database
Epsilon Grid Order: An Algorithm for the Similarity Join on Massive High-Dimensional Data
The similarity join is an important database primitive which has been successfully applied to speed up applications such as similarity search, data analysis and data mining. The s...
Christian Böhm, Bernhard Braunmüller, Fl...
ICDE
1997
IEEE
Database
High-Dimensional Similarity Joins
Many emerging data mining applications require a similarity join between points in a high-dimensional domain. We present a new algorithm that utilizes a new index structure, calle...
Kyuseok Shim, Ramakrishnan Srikant, Rakesh Agrawal
KDD
2001
ACM
Data Mining
GESS: a scalable similarity-join algorithm for mining large data sets in high dimensional spaces
The similarity join is an important operation for mining high-dimensional feature spaces. Given two data sets, the similarity join computes all tuples (x, y) that are within a dis...
Jens-Peter Dittrich, Bernhard Seeger
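For orientation, the similarity-join operation named in the abstract above can be sketched as a brute-force nested loop. The abstract is truncated, so this sketch assumes the usual reading: a pair (x, y) qualifies when its distance is at most a threshold. The function name `similarity_join`, the parameter `eps`, and the choice of Euclidean distance are illustrative assumptions. This is not GESS or any other algorithm listed on this page; it is only the naive baseline that such algorithms aim to improve on.

```python
from math import dist  # Euclidean distance, Python 3.8+


def similarity_join(r, s, eps):
    """Return every pair (x, y), x from r and y from s, with distance <= eps.

    Naive nested loop: compares all |r| * |s| pairs.
    Assumption: Euclidean distance and a fixed threshold eps.
    """
    return [(x, y) for x in r for y in s if dist(x, y) <= eps]


# Two small hypothetical point sets in 2 dimensions.
r = [(0.0, 0.0), (1.0, 1.0)]
s = [(0.1, 0.0), (5.0, 5.0), (1.0, 1.2)]

pairs = similarity_join(r, s, eps=0.25)
for x, y in pairs:
    print(x, y)
```

The quadratic number of comparisons is what makes this approach impractical for large, high-dimensional data sets.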
ICPP
2000
IEEE
A Scalable Parallel Subspace Clustering Algorithm for Massive Data Sets
Clustering is a data mining problem which finds dense regions in a sparse multi-dimensional data set. The attribute values and ranges of these regions characterize the clusters. ...
Harsha S. Nagesh, Sanjay Goil, Alok N. Choudhary
WWW
2004
ACM
Web data integration using approximate string join
Web data integration is an important preprocessing step for web mining. It is highly likely that several records on the web whose textual representations differ may represent the ...
Yingping Huang, Gregory R. Madey