Random sampling is a popular technique for providing fast approximate query answers, especially in data warehouse environments. Compared to other types of synopses, random sampling...
Many real-world applications, such as sensor networks and other monitoring applications, naturally generate probabilistic streams that are highly correlated in both time and space. ...
We consider the problem of efficiently finding the top-k answers for join queries over web-accessible databases. Classical algorithms for finding top-k answers use branch-and-b...
In this paper, we examine the problem of clustering massive-domain data streams. Massive-domain data streams are those in which the number of possible domain values for each a...
Window queries are proving essential to data-stream processing. In this paper, we present an approach for evaluating sliding-window aggregate queries that reduces both space and c...
Jin Li, David Maier, Kristin Tufte, Vassilis Papad...