The UpRight library seeks to make Byzantine fault tolerance (BFT) a simple and viable alternative to crash fault tolerance for a range of cluster services. We demonstrate UpRight ...
Allen Clement, Manos Kapritsos, Sangmin Lee, Yang ...
A challenging issue in today's server systems is to transparently deal with failures and application-imposed requirements for continuous operation. In this paper we address t...
As computational clusters increase in size, their mean time to failure decreases. Typically, checkpointing is used to minimize the loss of computation. Most checkpointing techniques, ...
This paper focuses on the Delay/Fault-Tolerant Mobile Sensor Network (DFT-MSN) for pervasive information gathering. We develop simple and efficient data delivery schem...
The probability that a failure will occur before the end of the computation increases as the number of processors used in a high performance computing application increases. For l...