Sciweavers

176 search results - page 1 / 36
» Performance prediction for set similarity joins
ICDM
2002
IEEE
163views Data Mining» more  ICDM 2002»
13 years 9 months ago
High Performance Data Mining Using the Nearest Neighbor Join
The similarity join has become an important database primitive to support similarity search and data mining. A similarity join combines two sets of complex objects such that the r...
Christian Böhm, Florian Krebs
ICDE
2009
IEEE
194views Database» more  ICDE 2009»
14 years 6 months ago
Top-k Set Similarity Joins
Similarity join is a useful primitive operation underlying many applications, such as near duplicate Web page detection, data integration, and pattern recognition. Tradi...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Haichuan Sh...
ICDE
1998
IEEE
142views Database» more  ICDE 1998»
14 years 5 months ago
High Dimensional Similarity Joins: Algorithms and Performance Evaluation
Current data repositories include a variety of data types, including audio, images and time series. State of the art techniques for indexing such data and doing query processing r...
Nick Koudas, Kenneth C. Sevcik
PVLDB
2010
195views more  PVLDB 2010»
12 years 11 months ago
Trie-Join: Efficient Trie-based String Similarity Joins with Edit-Distance Constraints
A string similarity join finds similar pairs between two collections of strings. It is an essential operation in many applications, such as data integration and cleaning, and has ...
Jiannan Wang, Guoliang Li, Jianhua Feng
SIGMOD
2004
ACM
182views Database» more  SIGMOD 2004»
14 years 4 months ago
Efficient set joins on similarity predicates
In this paper we present an efficient, scalable and general algorithm for performing set joins on predicates involving various similarity measures like intersect size, Jaccard-coe...
Sunita Sarawagi, Alok Kirpal