Sciweavers

EDBT
2006
ACM

A Decomposition-Based Probabilistic Framework for Estimating the Selectivity of XML Twig Queries

In this paper we present a novel approach for estimating the selectivity of XML twig queries. Such a technique is useful for approximate query answering and for choosing an optimal query plan for complex queries based on those estimates. Our approach relies on a summary structure that contains occurrence statistics of small twigs. We then present a novel probabilistic approach for decomposing larger twig queries into smaller ones and show how, in conjunction with the summary information, it can be used to estimate the selectivity of the larger query. We present and evaluate two approaches for decomposition and compare this work against a state-of-the-art selectivity estimation approach on synthetic and real datasets. Quantitatively, our results show that the new approach is much more efficient in terms of the time it takes to construct the summary and estimate the selectivity of a twig query. Qualitatively, the new approach is more accurate on most datasets.
Chao Wang, Srinivasan Parthasarathy, Ruoming Jin
Added 08 Dec 2009
Updated 08 Dec 2009
Type Conference
Year 2006
Where EDBT