Sciweavers

EDBT
2009
ACM

PROUD: a probabilistic approach to processing similarity queries over uncertain data streams

13 years 9 months ago
PROUD: a probabilistic approach to processing similarity queries over uncertain data streams
We present PROUD - A PRObabilistic approach to processing similarity queries over Uncertain Data streams, where the data streams here are mainly time series streams. In contrast to data with certainty, an uncertain series is an ordered sequence of random variables. The distance between two uncertain series is also a random variable. We use a general uncertain data model, where only the mean and the deviation of each random variable at each timestamp are available. We derive mathematical conditions for progressively pruning candidates to reduce the computation cost. We then apply PROUD to a streaming environment where only sketches of streams, like wavelet synopses, are available. Extensive experiments are conducted to evaluate the effectiveness of PROUD and compare it with Det, a deterministic approach that directly processes data without considering uncertainty. The results show that, compared with Det, PROUD offers a flexible trade-off between false positives and false negatives by...
Mi-Yen Yeh, Kun-Lung Wu, Philip S. Yu, Ming-Syan C
Added 24 Jul 2010
Updated 24 Jul 2010
Type Conference
Year 2009
Where EDBT
Authors Mi-Yen Yeh, Kun-Lung Wu, Philip S. Yu, Ming-Syan Chen
Comments (0)