We present a technique that masks failures in a cluster to provide high availability and fault-tolerance for long-running, parallelized dataflows. We can use these dataflows to im...
Mehul A. Shah, Joseph M. Hellerstein, Eric A. Brew...
The skyline of a set of d-dimensional points contains the points that are not dominated by any other point on all dimensions. Skyline computation has recently received considerabl...
Dimitris Papadias, Yufei Tao, Greg Fu, Bernhard Se...
In this paper, we investigate how to scale hierarchical clustering methods (such as OPTICS) to extremely large databases by utilizing data compression methods (such as BIRCH or ra...
Markus M. Breunig, Hans-Peter Kriegel, Peer Kr&oum...
We consider the fundamental operation of applying a conjunction of selection conditions to a set of records. With large main memories available cheaply, systems may choose to keep...
We consider a scenario where we want to query a large dataset that is stored in external memory and does not fit into main memory. The most constrained resources in such a situati...