Ranking Continuous Probabilistic Datasets

Ranking is a fundamental operation in data analysis and decision support, and it plays an even more crucial role when the dataset being explored exhibits uncertainty. In recent years, this has led to much work on how to rank uncertain datasets. In this paper, we address the problem of ranking when the tuple scores are uncertain and the uncertainty is captured using continuous probability distributions (e.g., Gaussian distributions). We present a comprehensive solution for computing the values of a parameterized ranking function (PRF) [19] for arbitrary continuous probability distributions (and thus for ranking the uncertain dataset); PRF can be used to simulate or approximate many other ranking functions proposed in prior work. We develop exact polynomial-time algorithms for some classes of continuous probability distributions, and efficient approximation schemes with provable guarantees for arbitrary probability distributions. Our algorithms can also be used for exact or approximate evalu...
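To make the setting in the abstract concrete, here is a minimal sketch, not the paper's algorithm. It assumes a PRF value has the form of a weighted sum over rank positions, sum over i of weight(i) * Pr(rank(t) = i), and that tuple scores are independent Gaussians. It estimates those values by plain Monte Carlo sampling, a generic technique swapped in for illustration, rather than by the exact or approximation algorithms the abstract refers to. The names `estimate_prf`, `rank_by_prf`, `score_params`, and `weight` are hypothetical.

```python
import random


def estimate_prf(score_params, weight, num_samples=20000, seed=0):
    """Monte Carlo estimate of sum_i weight(i) * Pr(rank(t) = i) per tuple.

    score_params: list of (mean, std) pairs, one independent Gaussian score
                  per tuple (an assumption made for this sketch).
    weight:       function from a 1-based rank position to a real weight
                  (hypothetical stand-in for the PRF weight function).
    """
    rng = random.Random(seed)
    n = len(score_params)
    totals = [0.0] * n
    for _ in range(num_samples):
        # Draw one possible world: a concrete score for every tuple.
        scores = [rng.gauss(mu, sigma) for mu, sigma in score_params]
        # Rank tuples by sampled score, highest score first.
        order = sorted(range(n), key=lambda t: scores[t], reverse=True)
        for rank, t in enumerate(order, start=1):
            totals[t] += weight(rank)
    return [total / num_samples for total in totals]


def rank_by_prf(score_params, weight, **kwargs):
    """Return tuple indices sorted by estimated PRF value, best first."""
    values = estimate_prf(score_params, weight, **kwargs)
    return sorted(range(len(values)), key=lambda t: values[t], reverse=True)


if __name__ == "__main__":
    # Three tuples: the second has a lower mean but a much larger variance.
    params = [(10.0, 1.0), (9.0, 5.0), (5.0, 0.5)]
    # With this weight, the value is the probability of being ranked first.
    top1 = lambda i: 1.0 if i == 1 else 0.0
    print(estimate_prf(params, top1))
    print(rank_by_prf(params, top1))
```

Changing `weight` changes the ranking semantics: an indicator on `i <= k` gives the probability of appearing in the top k, while a decreasing weight rewards higher positions more smoothly. Sampling only approximates the values, and its error shrinks slowly as `num_samples` grows.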
Jian Li, Amol Deshpande
Added 30 Jan 2011
Updated 30 Jan 2011
Type Journal
Year 2010