Data aggregation plays an important role in the design of scalable systems, allowing the determination of meaningful system-wide properties to direct the execution of distributed a...
We address the problem of content-based dissemination of highly-distributed, high-volume data streams for stream-based monitoring applications and large-scale data delivery. Exist...
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
This paper proposes a new parallel execution model where programmers augment a sequential program with pieces of code called serializers that dynamically map computational operati...
Matthew D. Allen, Srinath Sridharan, Gurindar S. S...
In a cluster of multiple processors or cpu-cores, many processes may run on each compute node. Each process tends to issue contiguous I/O requests for snapshot, checkpointing or s...