This paper tests the hypothesis that generic recovery techniques, such as process pairs, can survive most application faults without using application-specific information. We ex...
We present a method for performing fault localization using similar program spectra. Our method assumes the existence of a faulty run and a larger number of correct runs. It then ...
We explore the abstraction of failure transparency in which the operating system provides the illusion of failure-free operation. To provide failure transparency, an operating sy...
David E. Lowell, Subhachandra Chandra, Peter M. Ch...
Recovery systems must save state before a failure occurs to enable the system to recover from the failure. However, recovery will fail if the recovery system saves any state corru...
Highly available and resilient networks play a decisive role in today’s networked world. As network faults are inevitable and networks are becoming constantly intricate, finding...
Feng Liu, Antonis M. Hadjiantonis, Ha Manh Tran, M...