
PODS
2006
ACM

Space- and time-efficient deterministic algorithms for biased quantiles over data streams

Skew is prevalent in data streams and should be taken into account by algorithms that analyze the data. The problem of finding "biased quantiles" (that is, approximate quantiles that must be more accurate for more extreme values) is a framework for summarizing such skewed data on data streams. We present the first deterministic algorithms for answering biased quantile queries accurately with small (sublinear in the input size) space and time bounds in one pass. The space bound is near-optimal, and the amortized update cost is close to constant, making the approach practical for handling high-speed network data streams. We not only demonstrate theoretical properties of the algorithm, but also show that it uses less space than existing methods in many practical settings and is fast to maintain.

Keywords Data Stream Algorithms, Biased Quantiles
General Terms Algorithms, Performance
Categories and Subject Descriptors E.1 [Data]: Data Structures; F.2 [Theory]: Analysis of Algorithms
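To make "more accurate for more extreme values" concrete, the following is a minimal offline sketch, not the streaming algorithm of the paper. The abstract does not state the guarantee formally; the sketch assumes the common low-biased formulation, in which an answer to a phi-quantile query over n items may be off by at most eps * phi * n ranks, compared with eps * n for ordinary (uniform) approximate quantiles. The function names `allowed_rank_error` and `is_acceptable` are hypothetical and chosen only for illustration.

```python
import bisect


def allowed_rank_error(phi, n, eps, biased=True):
    """Rank error tolerated for a phi-quantile query over n items.

    Assumption: biased guarantee = eps * phi * n, uniform = eps * n.
    """
    return eps * phi * n if biased else eps * n


def is_acceptable(sorted_data, answer, phi, eps, biased=True):
    """Check whether `answer` is a valid approximate phi-quantile.

    `sorted_data` must be sorted ascending; ranks are 1-based.
    """
    n = len(sorted_data)
    target = phi * n
    # Range of ranks that `answer` occupies in the sorted data.
    lo = bisect.bisect_left(sorted_data, answer) + 1
    hi = bisect.bisect_right(sorted_data, answer)
    if hi < lo:
        return False  # `answer` does not occur in the data
    slack = allowed_rank_error(phi, n, eps, biased)
    # Valid if some rank of `answer` lies within `slack` of the target rank.
    return lo - slack <= target <= hi + slack


data = list(range(1, 10001))  # n = 10000, value == rank
# Query the 1st percentile (target rank 100) with eps = 0.01.
uniform_ok = is_acceptable(data, 150, phi=0.01, eps=0.01, biased=False)
biased_ok = is_acceptable(data, 150, phi=0.01, eps=0.01, biased=True)
exact_ok = is_acceptable(data, 100, phi=0.01, eps=0.01, biased=True)
```

Under these assumptions, an answer of rank 150 for the 1st percentile passes the uniform check (tolerance of about 100 ranks) but fails the biased one (tolerance of about 1 rank), which is the sense in which the tail of a skewed distribution is summarized more precisely.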
Added 08 Dec 2009
Updated 08 Dec 2009
Type Conference
Year 2006
Where PODS
Authors Graham Cormode, Flip Korn, S. Muthukrishnan, Divesh Srivastava