Sampling streams of continuous data with limited memory, or reservoir sampling, is a utility algorithm. Standard reservoir sampling maintains a random sample of the entire stream a...
Data mining has recently attracted attention as a set of efficient techniques that can discover patterns from huge data. More recent advancements in collecting massive evolving da...
Monitoring the performance of large shared computing systems such as the cloud computing infrastructure raises many challenging algorithmic problems. One common problem is to trac...
Chiranjeeb Buragohain, Luca Foschini, Subhash Suri
We address the problem of computing approximate answers to continuous sliding-window joins over data streams when the available memory may be insufficient to keep the entire join...
While there has been a lot of work on finding frequent itemsets in transaction data streams, none of these solve the problem of finding similar pairs according to standard similar...