Transient faults that arise in large-scale software systems can often be repaired by re-executing the code in which they occur. Ascribing a meaningful semantics for safe re-execut...
Using additional store-checkpoinsts (SCPs) and compare-checkpoints (CCPs), we present an adaptive checkpointing for double modular redundancy (DMR) in this paper. The proposed app...
Transactional memory systems promise to reduce the burden of exposing thread-level parallelism in programs by relieving programmers from analyzing complex inter-thread dependences...
Checkpointing is a commonly used approach to provide fault-tolerance and improve system dependability. However, using a constant and preconfigured checkpointing frequency may comp...