Sample synopses for approximate answering of group-by queries

With the amount of data in current data warehouse databases growing steadily, random sampling is continuously gaining in importance. In particular, interactive analyses of large datasets can greatly benefit from the significantly shorter response times of approximate query processing. Typically, those analytical queries partition the data into groups and aggregate the values within the groups. Further, with the commonly used roll-up and drill-down operations, a broad range of group-by queries is posed to the system, which makes the construction of highly specialized synopses difficult. In this paper, we propose a general-purpose sampling scheme that is biased in order to answer group-by queries with high accuracy. While existing techniques focus on the size of the group when computing its sample size, our technique is based on its standard deviation. The basic idea is that the more homogeneous a group is, the fewer representatives are required in order to give a good estimate. With an...
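The basic idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details are not given here: it simply splits a fixed sample budget across groups in proportion to each group's standard deviation. The names `allocate_by_stddev`, `draw_sample`, `budget`, and `min_per_group` are hypothetical, and the sketch assumes every group is non-empty and fits in memory.

```python
import random
import statistics


def allocate_by_stddev(groups, budget, min_per_group=1):
    """Split a sample budget across groups in proportion to each group's
    standard deviation (illustrative helper, not taken from the paper)."""
    # Population standard deviation per group; a constant group yields 0.
    spread = {g: statistics.pstdev(vals) for g, vals in groups.items()}
    total = sum(spread.values())
    sizes = {}
    for g, vals in groups.items():
        # If every group is constant, fall back to an equal split.
        share = spread[g] / total if total > 0 else 1 / len(groups)
        # Keep at least min_per_group rows so no group vanishes from the
        # sample, and never request more rows than the group holds.
        sizes[g] = min(len(vals), max(min_per_group, round(budget * share)))
    return sizes


def draw_sample(groups, sizes, seed=0):
    """Draw a uniform sample without replacement inside each group."""
    rng = random.Random(seed)
    return {g: rng.sample(vals, sizes[g]) for g, vals in groups.items()}


if __name__ == "__main__":
    data = {
        "homogeneous": [10.0] * 100,                      # no variation
        "heterogeneous": [float(i) for i in range(100)],  # wide spread
    }
    sizes = allocate_by_stddev(data, budget=20)
    sample = draw_sample(data, sizes)
    for g, rows in sample.items():
        print(g, len(rows), statistics.mean(rows))
```

Because of rounding and the per-group minimum, the allocated sizes may not sum exactly to `budget`. The homogeneous group needs only a single representative to estimate its average exactly, which is the intuition the abstract describes.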
Philipp Rösch, Wolfgang Lehner
Added 04 Sep 2010
Updated 04 Sep 2010
Type Conference
Year 2009
Where EDBT
Authors Philipp Rösch, Wolfgang Lehner