Sciweavers

ICDE
2009
IEEE

Leveraging COUNT Information in Sampling Hidden Databases

Many online databases are hidden behind form-like interfaces that let users run search queries by specifying selection conditions. Most of these interfaces return restricted answers (e.g., only the top-k of the selected tuples), and many also accompany each answer with the COUNT of the selected tuples. In this paper, we propose techniques that leverage the COUNT information to efficiently acquire unbiased samples of the hidden database. We also discuss variants for interfaces that do not provide COUNT information. We conduct extensive experiments to illustrate the efficiency and accuracy of our techniques.
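To illustrate why COUNT information can help produce unbiased samples, here is a minimal Python sketch of a COUNT-weighted drill-down over a simulated top-k interface. It is not the authors' algorithm, only one simple scheme consistent with the abstract. The names `HIDDEN_DB`, `K`, `query`, and `sample_one` are hypothetical, and the sketch assumes Boolean attributes, exact COUNT values, and no duplicate tuples.

```python
import random
from collections import Counter

K = 2  # hypothetical top-k limit of the simulated interface

# Hypothetical hidden database: distinct tuples over three Boolean attributes.
HIDDEN_DB = [
    (0, 0, 0), (0, 0, 1), (0, 1, 1),
    (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1),
]


def query(prefix):
    """Simulated form interface: fix the first len(prefix) attributes.

    Returns (top-k matching tuples, COUNT of all matching tuples).
    """
    matches = [t for t in HIDDEN_DB if t[:len(prefix)] == prefix]
    return matches[:K], len(matches)


def sample_one(rng):
    """Draw one tuple by COUNT-weighted drill-down (illustrative only)."""
    prefix = ()
    top_k, count = query(prefix)
    while count > K:  # the query overflows, so refine it by one more attribute
        if len(prefix) == len(HIDDEN_DB[0]):
            raise ValueError("fully specified query still overflows")
        # Choose a branch with probability proportional to its COUNT.
        branches = [prefix + (value,) for value in (0, 1)]
        counts = [query(branch)[1] for branch in branches]
        prefix = rng.choices(branches, weights=counts, k=1)[0]
        top_k, count = query(prefix)
    # Every matching tuple is now visible, so pick one uniformly.
    return rng.choice(top_k)


rng = random.Random(0)
freq = Counter(sample_one(rng) for _ in range(7000))
for tup in HIDDEN_DB:
    print(tup, freq[tup])
```

Under these assumptions, the probability of reaching a tuple is the product of the branch ratios COUNT(child)/COUNT(parent) times 1/COUNT(leaf query). That product telescopes to 1/N for every tuple, so the sample is uniform. The sketch makes no attempt to minimize the number of queries issued, which is the efficiency question the abstract raises.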
Arjun Dasgupta, Nan Zhang, Gautam Das
Added 20 Oct 2009
Updated 20 Oct 2009
Type Conference
Year 2009
Where ICDE