
PODS
2010
ACM

Understanding cardinality estimation using entropy maximization

Cardinality estimation is the problem of estimating the number of tuples returned by a query; it is a fundamentally important task in data management, used in query optimization, progress estimation, and resource provisioning. We study cardinality estimation in a principled framework: given a set of statistical assertions about the number of tuples returned by a fixed set of queries, predict the number of tuples returned by a new query. We model this problem using the probability space, over possible worlds, that satisfies all provided statistical assertions and maximizes entropy. We call this the Entropy Maximization model for statistics (MaxEnt). In this paper we develop the mathematical techniques needed to use the MaxEnt model for predicting the cardinality of conjunctive queries.

Categories and Subject Descriptors: H.2.4 [Systems]: Relational Databases
General Terms: Theory
Keywords: Cardinality Estimation, Database Theory, Maximum Entropy, Distinct Value Estimation
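
As a rough illustration of the idea in the abstract, the sketch below estimates the cardinality of a two-predicate conjunctive selection from single-attribute statistics alone. It is not the paper's possible-worlds construction: it assumes a much simpler setting in which the known statistics are the counts for each value of two attributes A and B, and it fits the maximum-entropy table of (A, B) cell counts by iterative proportional fitting, a swapped-in standard technique. The function name `maxent_joint` and the sample statistics are hypothetical.

```python
def maxent_joint(card_a, card_b, total, iters=50):
    """Fit expected counts for every (a, b) cell.

    card_a, card_b: dicts mapping each attribute value to its known count
                    (assumed statistics; both must sum to `total`).
    Starting from a uniform table and alternately rescaling rows and
    columns (iterative proportional fitting) converges to the table of
    maximum entropy among those matching the given marginal counts.
    """
    a_vals, b_vals = list(card_a), list(card_b)
    # Uniform start: the maximum-entropy table when nothing is known.
    start = total / (len(a_vals) * len(b_vals))
    joint = {(a, b): start for a in a_vals for b in b_vals}
    for _ in range(iters):
        # Rescale each row so it matches the statistic for A = a.
        for a in a_vals:
            row = sum(joint[(a, b)] for b in b_vals)
            if row > 0:
                for b in b_vals:
                    joint[(a, b)] *= card_a[a] / row
        # Rescale each column so it matches the statistic for B = b.
        for b in b_vals:
            col = sum(joint[(a, b)] for a in a_vals)
            if col > 0:
                for a in a_vals:
                    joint[(a, b)] *= card_b[b] / col
    return joint


if __name__ == "__main__":
    # Hypothetical statistics for a relation with 1000 tuples.
    card_a = {"x": 300, "y": 700}
    card_b = {"p": 400, "q": 600}
    joint = maxent_joint(card_a, card_b, total=1000)
    # Estimated cardinality of the new query: A = 'x' AND B = 'p'.
    print(joint[("x", "p")])
```

With only single-attribute statistics, the fitted table coincides with the familiar independence estimate, here 300 × 400 / 1000 = 120 tuples (up to floating-point rounding). Additional assertions, such as a known count for one multi-attribute query, would pull the estimate away from independence.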
Christopher Ré, Dan Suciu
Added 10 Jul 2010
Updated 10 Jul 2010
Type Conference
Year 2010
Where PODS
Authors Christopher Ré, Dan Suciu