Continuously shrinking feature sizes result in an increasing susceptibility of circuits to transient faults, e.g. due to environmental radiation. Approaches to implement fault tol...
An important step in achieving robustness to run-time faults is the ability to detect and repair problems when they arise in a running system. Effective fault detection a...
Paulo Casanova, Bradley R. Schmerl, David Garlan, ...
Fault screeners are a new breed of fault identification technique that can probabilistically detect if a transient fault has affected the state of a processor. We demonstrate that...
Paul Racunas, Kypros Constantinides, Srilatha Mann...
In the research reported in this paper, transient faults were injected in the nodes and in the communication subsystem (by using software fault injection) of a commercial parallel...
Concurrent error detection (CED) based on time redundancy entails performing the normal computation and the re-computation at different times and then comparing their results. Time...
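The recompute-and-compare idea in the excerpt above can be illustrated with a minimal Python sketch. It is not taken from the cited paper: the names `recompute_and_compare` and `make_flaky_adder` are hypothetical, and the single bit flip is an assumed stand-in for a transient fault.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def recompute_and_compare(compute: Callable[[], T]) -> Tuple[T, bool]:
    """Run `compute` twice at different times and compare the results.

    Returns (first_result, results_agree). A disagreement signals that
    an error affected one of the two runs.
    """
    first = compute()   # normal computation
    second = compute()  # re-computation, performed later
    return first, first == second


def make_flaky_adder(a: int, b: int, faulty_call: int) -> Callable[[], int]:
    """Hypothetical fault injector: returns an adder whose result has its
    lowest bit flipped on exactly one call (the `faulty_call`-th)."""
    calls = {"n": 0}

    def add() -> int:
        calls["n"] += 1
        result = a + b
        if calls["n"] == faulty_call:
            result ^= 1  # simulated transient bit flip
        return result

    return add


if __name__ == "__main__":
    # Fault-free run: both computations agree.
    print(recompute_and_compare(lambda: 2 + 3))
    # Fault during the re-computation: the comparison flags a mismatch.
    print(recompute_and_compare(make_flaky_adder(2, 3, faulty_call=2)))
```

The comparison only reveals that the two runs disagree, not which one is wrong, and a fault that corrupts both runs identically would pass unnoticed in this simple form.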