In this paper, we introduce an efficient technique for checkpointing multithreaded applications. Our approach makes use of processes constructed around the ARMOR (Adaptive Reconfi...
Keith Whisnant, Zbigniew Kalbarczyk, Ravishankar K...
Multiple threads running in a single, shared address space is a simple model for writing parallel programs for symmetric multiprocessor (SMP) machines and for overlapping I/O and ...
Transient faults that arise in large-scale software systems can often be repaired by re-executing the code in which they occur. Ascribing a meaningful semantics for safe re-execut...
Workstation clusters are becoming an interesting alternative to dedicated multiprocessors. In this environment, the probability of a failure, during an application's executio...