
ICDE
2006
IEEE

Reasoning About Approximate Match Query Results

Join techniques that deploy approximate match predicates are fundamental data cleaning operations. A variety of predicates have been used to quantify approximate match in such operations, and some have been embedded in a declarative data cleaning framework. These techniques return pairs of tuples, one from each relation, tagged with a score that signifies the degree of similarity between the tuples in the pair according to the specific approximate match predicate. In this paper we consider the problem of estimating various parameters of the output of declarative approximate join algorithms for planning purposes. Such algorithms are highly time consuming, so precise knowledge of the result size, as well as its score distribution, is a pressing concern. This knowledge aids decisions as to which operations are more promising for identifying highly similar tuples, which is a key operation for data cleaning. We propose solution strategies that fully comply with a declarative framework and analytica...
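As a rough illustration of the setting the abstract describes, the sketch below scores tuple pairs with one possible approximate match predicate (Jaccard similarity over character q-grams) and then estimates the join's result size from a uniform random sample of pairs. This is not the paper's method: the predicate choice, the sampling estimator, and every name here (`qgrams`, `jaccard`, `approximate_join`, `estimate_result_size`) are assumptions made for illustration only.

```python
import random


def qgrams(s, q=2):
    """Return the set of character q-grams of s (lowercased)."""
    s = s.lower()
    return {s[i:i + q] for i in range(len(s) - q + 1)}


def jaccard(a, b, q=2):
    """Jaccard similarity of the q-gram sets of a and b, in [0, 1]."""
    ga, gb = qgrams(a, q), qgrams(b, q)
    if not ga and not gb:
        return 1.0  # both strings too short to yield q-grams: treat as equal
    return len(ga & gb) / len(ga | gb)


def approximate_join(r, s, threshold):
    """Exact (quadratic) approximate join: all pairs scoring >= threshold."""
    result = []
    for x in r:
        for y in s:
            score = jaccard(x, y)
            if score >= threshold:
                result.append((x, y, score))
    return result


def estimate_result_size(r, s, threshold, sample_size, seed=0):
    """Estimate the join size from uniformly sampled pairs (illustrative only)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(sample_size):
        if jaccard(rng.choice(r), rng.choice(s)) >= threshold:
            hits += 1
    # Scale the sampled hit rate up to the full cross product.
    return hits / sample_size * len(r) * len(s)
```

Calling `approximate_join(r, s, 0.5)` on two lists of strings returns the scored pairs, while `estimate_result_size(r, s, 0.5, sample_size=1000)` gives a cheap guess at how many pairs that join would return. The same sampled scores could be bucketed into a histogram to approximate the score distribution.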
Added 01 Nov 2009
Updated 01 Nov 2009
Type Conference
Year 2006
Where ICDE
Authors Sudipto Guha, Nick Koudas, Divesh Srivastava, Xiaohui Yu