Stream-processing systems are designed to support an emerging class of applications that require sophisticated and timely processing of high-volume data streams, often originating...
Alex Rasin, Jeong-Hyon Hwang, Magdalena Balazinska...
This paper addresses the recovery and the rollback problem in distributed collaborative transactions. We propose a solution to the problem in a generalized ARIES [9] framework. We...
Reliable multicast communication is important in large-scale distributed applications. For example, reliable multicast is used to transmit terrain and environmental updates in dis...
Hugh W. Holbrook, Sandeep K. Singhal, David R. Che...
Fault tolerance is an important issue for large machines with tens or hundreds of thousands of processors. Checkpoint-based methods, currently used on most machines, roll back all ...
Execution of MPI applications on Clusters and Grid deployments suffers from node and network failures, which motivates the use of fault-tolerant MPI implementations. Two category tec...