Most clusters so far have used WAN or LAN-based network products for communication due to their market availability. However, they do not always match communication patterns in cl...
Existing low-latency protocols make unrealistically strong assumptions about reliability. This allows them to achieve impressive performance, but also prevents this performance bei...
Stephen R. Donaldson, Jonathan M. D. Hill, David B...
The goal of OWL (Object-Oriented Workplace Laboratory) is to provide an object-oriented and component-based framework that supports the engineering of applications for the design, ...
: A middleware architecture named ROAFTS (Real-time Object-oriented Adaptive Fault Tolerance Support) is presented. ROAFTS is designed to support adaptive fault-tolerant execution ...
: The distributed recovery block (DRB) scheme is a widely applicable approach for realizing both hardware and software fault tolerance in real-time distributed and parallel compute...