Sciweavers

ISCA
2010
IEEE

Relax: an architectural framework for software recovery of hardware faults

13 years 9 months ago
Relax: an architectural framework for software recovery of hardware faults
As technology scales ever further, device unreliability is creating excessive complexity for hardware to maintain the illusion of perfect operation. In this paper, we consider whether exposing hardware fault information to software and allowing software to control fault recovery simplifies hardware design and helps technology scaling. The combination of emerging applications and emerging many-core architectures makes software recovery a viable alternative to hardware-based fault recovery. Emerging applications tend to have few I/O and memory side-effects, which limits the amount of information that needs checkpointing, and they allow discarding individual sub-computations with small qualitative impact. Software recovery can harness these properties in ways that hardware recovery cannot. We describe Relax, an architectural framework for software recovery of hardware faults. Relax includes three core components: (1) an ISA extension that allows software to mark regions of code for sof...
Marc de Kruijf, Shuou Nomura, Karthikeyan Sankaral
Added 10 Jul 2010
Updated 10 Jul 2010
Type Conference
Year 2010
Where ISCA
Authors Marc de Kruijf, Shuou Nomura, Karthikeyan Sankaralingam
Comments (0)