This paper describes the use of fault tolerance in a multiagent system. Such an approach is based on the modeling of autonomous agents with planning capabilities. These capabiliti...
Model checking is shown to be an effective tool in validating the behavior of a fault tolerant embedded spacecraft controller. The case study presented here at by judiciously abst...
Francis Schneider, Steve M. Easterbrook, John R. C...
Protocols that solve agreement problems are essential building blocks for fault tolerant distributed systems. While many protocols have been published, little has been done to ana...
Dynamic error processing approaches are an important mechanism to increase the reliability in a multiprocessor system, while making efficient use of the available resources. To th...
Andrea Bondavalli, Silvano Chiaradonna, Felicita D...
Fault tolerance is an important issue for large machines with tens or hundreds of thousands of processors. Checkpoint-based methods, currently used on most machines, rollback all ...