Sciweavers

DSN
2005
IEEE

ReStore: Symptom Based Soft Error Detection in Microprocessors

Device scaling and large-scale integration have led to growing concerns about soft errors in microprocessors. To date, in all but the most demanding applications, implementing parity and ECC for caches and other large, regular SRAM structures has been sufficient to stem the growing soft error tide. This will not be the case for long, and questions remain as to the best way to detect and recover from soft errors in the remainder of the processor, in particular the less structured execution core. In this work, we propose the ReStore architecture, which leverages existing performance-enhancing checkpointing hardware to recover from soft error events at low cost. Error detection in the ReStore architecture is novel: symptoms that hint at the presence of soft errors trigger restoration of a previous checkpoint. Example symptoms include exceptions, control flow misspeculations, and cache or translation look-aside buffer misses. Compared to conventional soft error detection ...
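The symptom-triggered rollback idea described in the abstract can be illustrated with a small software toy. The sketch below is not the authors' hardware design: the `run` function, the accumulator "core", the `checkpoint_interval` and `fault_step` parameters, and the choice of an out-of-range load as the only symptom are all assumptions made for illustration. It models a transient bit flip in an address, treats the resulting access exception as a symptom, and restores the most recent checkpoint before re-executing.

```python
def run(memory, addresses, checkpoint_interval=4, fault_step=None):
    """Toy core: sums memory[addr] for each addr in `addresses`.

    Hypothetical model, not the ReStore hardware:
    - a checkpoint of (pc, acc) is taken every `checkpoint_interval` instructions
    - a transient fault flips one address bit at dynamic step `fault_step`
    - an out-of-range load is the only symptom this toy recognizes
    """
    acc = 0                    # architectural state: one accumulator
    pc = 0                     # index of the next instruction
    checkpoint = (pc, acc)
    rollbacks = 0
    step = 0                   # dynamic instruction count, including re-execution

    while pc < len(addresses):
        if pc % checkpoint_interval == 0:
            checkpoint = (pc, acc)

        addr = addresses[pc]
        if step == fault_step:
            addr ^= 1 << 20    # transient bit flip; it does not recur on replay
        step += 1

        if not 0 <= addr < len(memory):
            # Symptom observed: restore the last checkpoint and re-execute.
            pc, acc = checkpoint
            rollbacks += 1
            continue

        acc += memory[addr]
        pc += 1

    return acc, rollbacks


memory = [1, 2, 3, 4, 5, 6, 7, 8]
addresses = list(range(8))
clean = run(memory, addresses)
faulty = run(memory, addresses, fault_step=5)
```

In this toy, the faulty run ends with the same accumulator value as the clean run, at the cost of one rollback and a few re-executed instructions. A flip that kept the address in range would produce no symptom and would go undetected here, which is why the abstract speaks of symptoms that only hint at the presence of soft errors.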
Nicholas J. Wang, Sanjay J. Patel
Added: 24 Jun 2010
Updated: 24 Jun 2010
Type: Conference
Year: 2005
Where: DSN
Authors: Nicholas J. Wang, Sanjay J. Patel