Sciweavers

ASPLOS
2006
ACM

Understanding prediction-based partial redundant threading for low-overhead, high- coverage fault tolerance

13 years 10 months ago
Understanding prediction-based partial redundant threading for low-overhead, high- coverage fault tolerance
Redundant threading architectures duplicate all instructions to detect and possibly recover from transient faults. Several lighter weight Partial Redundant Threading (PRT) architectures have been proposed recently. (i) Opportunistic Fault Tolerance duplicates instructions only during periods of poor single-thread performance. (ii) ReStore does not explicitly duplicate instructions and instead exploits mispredictions among highly confident branch predictions as symptoms of faults. (iii) Slipstream creates a reduced alternate thread by replacing many instructions with highly confident predictions. We explore PRT as a possible direction for achieving the fault tolerance of full duplication with the performance of single-thread execution. Opportunistic and ReStore yield partial coverage since they are restricted to using only partial duplication or only confident predictions, respectively. Previous analysis of Slipstream fault tolerance was cursory and concluded that only duplicated instr...
Vimal K. Reddy, Eric Rotenberg, Sailashri Parthasa
Added 13 Jun 2010
Updated 13 Jun 2010
Type Conference
Year 2006
Where ASPLOS
Authors Vimal K. Reddy, Eric Rotenberg, Sailashri Parthasarathy
Comments (0)