Sciweavers

VLDB
2005
ACM

Summarizing and Mining Inverse Distributions on Data Streams via Dynamic Inverse Sampling

Emerging data stream management systems face massive data distributions that arrive at high speed while only a small amount of storage is available; they approach this challenge by summarizing and mining the distributions using samples or sketches. However, data distributions can be “viewed” in different ways. A data stream of integer values can be viewed either as the forward distribution f(x), i.e., the number of occurrences of x in the stream, or as its inverse, f⁻¹(i), which is the number of items that appear i times. While both such “views” are equivalent in stored data systems, over data streams that entail approximations, they may be significantly different. In other words, samples and sketches developed for the forward distribution may be ineffective for summarizing or mining the inverse distribution. Yet, many applications such as IP traffic monitoring naturally rely on mining inverse distributions. We formalize the problems of managing and mining inverse distributions and show provable...
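To make the two views in the abstract concrete, the following minimal sketch computes both distributions exactly for a small in-memory stream. It is an illustration only: it stores every distinct item, which is precisely what a streaming setting cannot afford, and it is not the dynamic inverse sampling method of the paper. The function names and the sample stream are assumptions introduced here.

```python
from collections import Counter


def forward_distribution(stream):
    """Return f, where f[x] is the number of occurrences of x in the stream."""
    return Counter(stream)


def inverse_distribution(stream):
    """Return f_inv, where f_inv[i] is the number of distinct items seen exactly i times."""
    forward = forward_distribution(stream)
    # Count how many items share each occurrence count.
    return Counter(forward.values())


# Hypothetical stream of integer values, chosen only for illustration.
stream = [3, 7, 3, 9, 7, 3, 4]

forward = forward_distribution(stream)
inverse = inverse_distribution(stream)

print(dict(forward))  # occurrences per item
print(dict(inverse))  # number of items per occurrence count
```

Here the item 3 appears three times, 7 appears twice, and 9 and 4 appear once each, so two items have count 1, one item has count 2, and one item has count 3. With exact counts the inverse follows directly from the forward distribution; the abstract's point is that this equivalence no longer holds once the forward distribution is only approximated by a sample or sketch.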
Graham Cormode, S. Muthukrishnan, Irina Rozenbaum
Added 28 Jun 2010
Updated 28 Jun 2010
Type Conference
Year 2005
Where VLDB
Authors Graham Cormode, S. Muthukrishnan, Irina Rozenbaum