Click fraud is jeopardizing the industry of Internet advertising. Internet advertising is crucial for the thriving of the entire Internet, since it allows producers to advertise t...
Conditional Random Sampling (CRS) was originally proposed for efficiently computing pairwise (l2, l1) distances, in static, large-scale, and sparse data. This study modifies the o...
We continue the study of approximating the number of distinct elements in a data stream of length n to within a (1 ± ε) factor. It is known that if the stream may consist of arbitra...
Because of the high volume and unpredictable arrival rate, stream processing systems may not always be able to keep up with the input data streams, resulting in buffer overflow a...
While traditional database systems optimize for performance on one-shot queries, emerging large-scale monitoring applications require continuous tracking of complex aggregates and...
Graham Cormode, Minos N. Garofalakis, S. Muthukris...