Sciweavers

SIGMOD
2004
ACM

CORDS: Automatic Discovery of Correlations and Soft Functional Dependencies

The rich dependency structure found in the columns of real-world relational databases can be exploited to great advantage, but can also cause query optimizers, which usually assume that columns are statistically independent, to underestimate the selectivities of conjunctive predicates by orders of magnitude. We introduce CORDS, an efficient and scalable tool for automatic discovery of correlations and soft functional dependencies between columns. CORDS searches for column pairs that might have interesting and useful dependency relations by systematically enumerating candidate pairs and simultaneously pruning unpromising candidates using a flexible set of heuristics. A robust chi-squared analysis is applied to a sample of column values in order to identify correlations, and the number of distinct values in the sampled columns is analyzed to detect soft functional dependencies. CORDS can be used as a data mining tool, producing dependency graphs that are of intrinsic interest. We focus ...
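The two tests named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the sample data, and the specific distinct-value ratio used as a dependency score are assumptions made for illustration, and the sketch omits the candidate enumeration, pruning heuristics, and robustness adjustments that the abstract mentions.

```python
from collections import Counter


def chi_squared(pairs):
    """Pearson chi-squared statistic for the contingency table of (a, b) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    stat = 0.0
    for a, count_a in left.items():
        for b, count_b in right.items():
            # Expected cell count if the two columns were independent.
            expected = count_a * count_b / n
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def fd_strength(pairs):
    """Assumed score: distinct a-values divided by distinct (a, b) pairs.

    A value of 1.0 means every a-value maps to a single b-value in the sample.
    """
    distinct_a = len({a for a, _ in pairs})
    distinct_ab = len(set(pairs))
    return distinct_a / distinct_ab


# Hypothetical sampled rows from two columns (make, country).
dependent = [("Honda", "Japan")] * 4 + [("Toyota", "Japan")] * 3 + [("Ford", "USA")] * 3
# Hypothetical sample in which every combination occurs equally often.
independent = [(a, b) for a in ("x", "y") for b in ("p", "q")] * 3

print(chi_squared(dependent), fd_strength(dependent))
print(chi_squared(independent), fd_strength(independent))
```

A large statistic relative to the chi-squared distribution for the table's degrees of freedom suggests correlation, while a distinct-value ratio near 1 suggests a soft functional dependency from the first column to the second. Any decision thresholds would have to be chosen separately; none are given in this abstract.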
Added: 08 Dec 2009
Updated: 08 Dec 2009
Type: Conference
Year: 2004
Where: SIGMOD
Authors: Ihab F. Ilyas, Volker Markl, Peter J. Haas, Paul Brown, Ashraf Aboulnaga