Mergeable summaries


We study the mergeability of data summaries. Informally speaking, mergeability requires that, given two summaries on two data sets, there is a way to merge the two summaries into a single summary on the union of the two data sets, while preserving the error and size guarantees. This property means that the summaries can be merged in the same way as other algebraic operators such as sum and max, which is especially useful for computing summaries on massive distributed data. Several data summaries are trivially mergeable by construction, most notably all the sketches that are linear functions of the data sets. But some other fundamental ones, like those for heavy hitters and quantiles, are not (known to be) mergeable. In this paper, we demonstrate that these summaries are indeed mergeable or can be made mergeable after appropriate modifications. Specifically, we show that for ε-approximate heavy hitters, there is a deterministic mergeable summary of size O(1/ε); for ε-approximate quantiles...
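To make the heavy-hitters claim concrete, here is a minimal sketch of merging two counter-based summaries in the style of the Misra-Gries algorithm. The truncated abstract above does not name the summary it uses, so the choice of Misra-Gries, the function names (`mg_update`, `mg_merge`, `mg_build`), and the parameter `k` (the maximum number of counters, on the order of 1/ε) are assumptions made for illustration. The merge adds the two sets of counters, then subtracts the (k+1)-th largest value from every counter and keeps only the positive ones, so the result again holds at most `k` counters.

```python
from collections import Counter


def mg_update(summary, item, k):
    """Add one item to a Misra-Gries-style summary with at most k counters."""
    if item in summary or len(summary) < k:
        summary[item] = summary.get(item, 0) + 1
    else:
        # Summary is full: decrement every counter, dropping those that hit zero.
        for key in list(summary):
            summary[key] -= 1
            if summary[key] == 0:
                del summary[key]
    return summary


def mg_build(items, k):
    """Build a summary of an iterable of items (hypothetical helper)."""
    summary = {}
    for item in items:
        mg_update(summary, item, k)
    return summary


def mg_merge(a, b, k):
    """Merge two summaries into one that again has at most k counters."""
    combined = Counter(a)
    combined.update(b)  # counters for the same item are added together
    if len(combined) > k:
        # Subtract the (k+1)-th largest counter value and keep positive counters.
        threshold = sorted(combined.values(), reverse=True)[k]
        combined = {
            key: count - threshold
            for key, count in combined.items()
            if count > threshold
        }
    return dict(combined)


# Usage: summarize two partitions separately, then merge the summaries.
left = mg_build(["a"] * 6 + ["b", "c"], k=2)
right = mg_build(["a"] * 4 + ["d"] * 3 + ["e"], k=2)
merged = mg_merge(left, right, k=2)
```

Each stored counter underestimates the true frequency of its item, and the size bound holds after every merge, which is what lets summaries built on separate machines be combined in any order.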

Pankaj K. Agarwal, Graham Cormode, Zengfeng Huang, Jeff M. Phillips, Zhewei Wei, Ke Yi


Algebraic Operators | Database | Faculty Research Award | PODS 2012 | Size Guarantees |


Post Info

Added	27 Sep 2012
Updated	27 Sep 2012
Type	Journal
Year	2012
Where	PODS
Authors	Pankaj K. Agarwal, Graham Cormode, Zengfeng Huang, Jeff M. Phillips, Zhewei Wei, Ke Yi
