Sciweavers

SIGMOD
2008
ACM

Sampling Cube: A Framework for Statistical OLAP over Sampling Data

Sampling is a popular method of data collection when it is impossible or too costly to reach the entire population. For example, television show ratings in the United States are gathered from a sample of roughly 5,000 households. To use the results effectively, the samples are further partitioned in a multidimensional space based on multiple attribute values. This naturally makes OLAP (Online Analytical Processing) over sampling data desirable. However, unlike traditional data, sampling data is inherently uncertain, i.e., it does not represent the full data in the population. Thus, it is desirable to return not only query results but also confidence intervals indicating the reliability of those results. Moreover, a certain segment in a multidimensional space may contain no samples or too few. This requires additional analysis to return trustworthy results. In this paper we propose a Sampling Cube framework, which efficiently calculates confidence intervals for any multi...
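To illustrate the idea of returning a confidence interval alongside each query result, here is a minimal Python sketch. It is not the paper's algorithm: it assumes a plain normal approximation for the interval around a cell's sample mean, and the function names (confidence_interval, cell_intervals), the min_samples threshold, and the example rows are all hypothetical. Cells with too few samples are simply flagged with None rather than given the additional analysis the abstract mentions.

```python
import math
from collections import defaultdict
from statistics import NormalDist, mean, stdev


def confidence_interval(values, confidence=0.95, min_samples=2):
    """Return (low, mean, high) for the sample mean, or None if too few samples.

    Assumption: normal approximation; a real system may prefer a
    t-distribution for small cells.
    """
    n = len(values)
    if n < max(min_samples, 2):  # stdev needs at least two values
        return None
    center = mean(values)
    std_err = stdev(values) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    margin = z * std_err
    return (center - margin, center, center + margin)


def cell_intervals(rows, dims, measure, confidence=0.95, min_samples=2):
    """Group sample rows by the chosen dimensions and attach an interval per cell."""
    cells = defaultdict(list)
    for row in rows:
        key = tuple(row[d] for d in dims)
        cells[key].append(row[measure])
    return {
        key: confidence_interval(vals, confidence, min_samples)
        for key, vals in cells.items()
    }


if __name__ == "__main__":
    # Hypothetical survey rows, made up purely for illustration.
    rows = [
        {"age": "18-34", "region": "east", "hours": 2.0},
        {"age": "18-34", "region": "east", "hours": 3.5},
        {"age": "18-34", "region": "west", "hours": 4.0},
        {"age": "35-54", "region": "east", "hours": 1.0},
    ]
    for cell, interval in cell_intervals(rows, ["age"], "hours").items():
        print(cell, interval)
```

In this sketch the ("35-54",) cell holds a single sample, so it is reported as None instead of a misleadingly precise number.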
Added 08 Dec 2009
Updated 08 Dec 2009
Type Conference
Year 2008
Where SIGMOD
Authors Xiaolei Li, Jiawei Han, Zhijun Yin, Jae-Gil Lee, Yizhou Sun