ISCA
2002
IEEE

Transient-Fault Recovery Using Simultaneous Multithreading

We propose a scheme for transient-fault recovery called Simultaneously and Redundantly Threaded processors with Recovery (SRTR) that enhances a previously proposed scheme for transient-fault detection, called Simultaneously and Redundantly Threaded (SRT) processors. SRT replicates an application into two communicating threads, one executing ahead of the other. The trailing thread repeats the computation performed by the leading thread, and the values produced by the two threads are compared. In SRT, a leading instruction may commit before the check for faults occurs, relying on the trailing thread to trigger detection. In contrast, SRTR must not allow any leading instruction to commit before checking occurs, since a faulty instruction cannot be undone once the instruction commits. To avoid stalling leading instructions at commit while waiting for their trailing counterparts, SRTR exploits the time between the completion and commit of leading instructions. SRTR compares the leading and...
T. N. Vijaykumar, Irith Pomeranz, Karl Cheng
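
The commit-ordering difference between SRT and SRTR described in the abstract can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the paper's hardware mechanism: instructions are modeled as functions on a single accumulator, a transient fault is a single bit flip in the leading result, and recovery is assumed to be a re-execution from the last committed state. The names `run_detect_only`, `run_checked`, `PROGRAM`, and `FAULT_MASK` are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical model: an "instruction" maps the accumulator to a new value.
Instr = Callable[[int], int]

FAULT_MASK = 1 << 3  # assumed fault model: one flipped bit in the leading result

PROGRAM: List[Instr] = [lambda x: x + 5, lambda x: x * 3, lambda x: x - 4]


def run_detect_only(program: List[Instr], fault_at: Optional[int] = None,
                    start: int = 0) -> Tuple[int, bool]:
    """SRT-like toy: the leading result commits first, the check comes later.

    Returns (leading committed state, fault detected?).
    """
    lead_state = trail_state = start
    for pc, instr in enumerate(program):
        lead = instr(lead_state)
        if fault_at == pc:
            lead ^= FAULT_MASK            # inject the transient fault
        lead_state = lead                 # committed before any comparison
        trail_state = instr(trail_state)  # trailing thread repeats the work
        if lead_state != trail_state:
            return lead_state, True       # detected, but state is already corrupted
    return lead_state, False


def run_checked(program: List[Instr], fault_at: Optional[int] = None,
                start: int = 0) -> Tuple[int, int]:
    """SRTR-like toy: compare leading and trailing results before commit.

    Returns (committed state, number of mismatches recovered from).
    """
    committed = start
    mismatches = 0
    pending_fault = fault_at
    pc = 0
    while pc < len(program):
        lead = program[pc](committed)
        if pending_fault == pc:
            lead ^= FAULT_MASK
            pending_fault = None          # transient: does not recur on retry
        trail = program[pc](committed)
        if lead != trail:
            mismatches += 1               # nothing committed yet, so simply retry
            continue
        committed = lead                  # commit only after the check passes
        pc += 1
    return committed, mismatches
```

With a fault injected at the second instruction, `run_detect_only(PROGRAM, fault_at=1)` reports the fault only after the corrupted value has been committed, whereas `run_checked(PROGRAM, fault_at=1)` retries the instruction and ends with the same committed value as a fault-free run.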
Added 15 Jul 2010
Updated 15 Jul 2010
Type Conference