Sciweavers

» Probabilistic string similarity joins
ICDE
2006
IEEE
14 years 6 months ago
A Primitive Operator for Similarity Joins in Data Cleaning
Data cleaning based on similarities involves identification of "close" tuples, where closeness is evaluated using a variety of similarity functions chosen to suit the do...
Surajit Chaudhuri, Venkatesh Ganti, Raghav Kaushik
IQIS
2007
ACM
13 years 6 months ago
Accuracy of Approximate String Joins Using Grams
Approximate join is an important part of many data cleaning and integration methodologies. Various similarity measures have been proposed for accurate and efficient matching of st...
Oktie Hassanzadeh, Mohammad Sadoghi, Renée ...
PVLDB
2010
13 years 3 months ago
Set Similarity Join on Probabilistic Data
Set similarity join has played an important role in many real-world applications such as data cleaning, near duplication detection, data integration, and so on. In these applicati...
Xiang Lian, Lei Chen 0002
SBBD
2007
13 years 6 months ago
Embedding Similarity Joins into Native XML Databases
Similarity joins in databases can be used for several important tasks such as data cleaning and instance-based data integration. In this paper, we explore ways how to support such ...
Leonardo Ribeiro, Theo Härder
WWW
2004
ACM
14 years 5 months ago
Web data integration using approximate string join
Web data integration is an important preprocessing step for web mining. It is highly likely that several records on the web whose textual representations differ may represent the ...
Yingping Huang, Gregory R. Madey