Most application level fault tolerance schemes in literature are non-adaptive in the sense that the fault tolerance schemes incorporated in applications are usually designed witho...
Zizhong Chen, Ming Yang, Guillermo A. Francia III,...
We present a protocol for the distributed detection of garbage in a distributed system subject to common failures such as lost and duplicated messages, network partition, dismount...
Fault tolerance in parallel systems has traditionally been achieved through a combination of redundancy and checkpointing methods. This notion has also been extended to message-pas...
Rajanikanth Batchu, Yoginder S. Dandass, Anthony S...
It is widely accepted that transient failures will appear more frequently in chips designed in the near future due to several factors such as the increased integration scale. On t...
This paper proposes a new low power cache architecture that utilizes fault tolerance to allow aggressively reduced voltage levels. The fault tolerant overhead circuits consume lit...
Mohammad A. Makhzan, Amin Khajeh Djahromi, Ahmed M...