In this paper we present an approach to the synthesis of fault-tolerant schedules for embedded applications with soft and hard real-time constraints. We are interested to guarante...
Viacheslav Izosimov, Paul Pop, Petru Eles, Zebo Pe...
We present a tunable diagnostic protocol for generic time-triggered (TT) systems to detect crash and send/receive omission faults. Compared to existing diagnostic and membership p...
Marco Serafini, Neeraj Suri, Jonny Vinter, Astrit ...
This paper tests the hypothesis that generic recovery techniques, such as process pairs, can survive most application faults without using application-specific information. We ex...
Designing a distributed fault tolerance algorithm requires careful analysis of both fault models and diagnosis strategies. A system will fail if there are too many active faults, ...
We present a new approach to fault tolerance for High Performance Computing systems. Our approach is based on a careful adaptation of the Algorithmic Based Fault Tolerance techniq...
George Bosilca, Remi Delmas, Jack Dongarra, Julien...