
DSN
2008
IEEE

Trace-based microarchitecture-level diagnosis of permanent hardware faults

As devices continue to scale, future shipped hardware will likely fail due to in-the-field hardware faults. Because traditional redundancy-based hardware reliability solutions that tackle these faults will be too expensive to deploy broadly, recent research has focused on low-overhead reliability solutions. One approach is to employ low-overhead (“always-on”) detection techniques that catch high-level symptoms and pay a higher overhead for (rarely invoked) diagnosis. This paper presents trace-based fault diagnosis, a diagnosis strategy that identifies permanent faults in microarchitectural units by analyzing the faulty core’s instruction trace. Once a fault is detected, the faulty core is rolled back and re-executes from a previous checkpoint, generating a faulty instruction trace and recording the microarchitecture-level resource usage. A diagnosis process on another fault-free core then generates a fault-free trace which it compares with the faulty trace to identify the fau...
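The trace comparison described above can be illustrated with a minimal sketch. This is not the paper's algorithm: the record layout (`TraceEntry`), the resource names, and the ranking heuristic (count how often each resource was used by an instruction whose result differs from the fault-free trace) are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    """One retired instruction (hypothetical layout, not from the paper)."""
    pc: int            # program counter of the instruction
    result: int        # value the instruction produced
    resources: tuple   # microarchitectural units it used, e.g. ("alu1", "preg4")


def diagnose(faulty_trace, golden_trace):
    """Rank resources by how many mismatching instructions used them.

    Walks both traces in lockstep. An instruction whose result differs from
    the fault-free trace adds one vote to every resource it used. Comparison
    stops once the program counters diverge, because entries after that point
    no longer correspond to the same instruction.
    """
    suspects = Counter()
    for faulty, golden in zip(faulty_trace, golden_trace):
        if faulty.pc != golden.pc:
            break
        if faulty.result != golden.result:
            suspects.update(faulty.resources)
    return suspects.most_common()


# Illustrative data only: the second and third instructions mismatch,
# and both of them used the (hypothetical) unit "alu1".
golden = [
    TraceEntry(0x10, 5, ("alu0", "preg3")),
    TraceEntry(0x14, 7, ("alu1", "preg4")),
    TraceEntry(0x18, 12, ("alu1", "preg5")),
]
faulty = [
    TraceEntry(0x10, 5, ("alu0", "preg3")),
    TraceEntry(0x14, 6, ("alu1", "preg4")),
    TraceEntry(0x18, 11, ("alu1", "preg5")),
]
ranking = diagnose(faulty, golden)
```

In this toy case the resource shared by both mismatching instructions ends up at the top of `ranking`. A real diagnosis would also have to separate the original corruption from values that are wrong only because they consumed an earlier faulty result, which this sketch does not attempt.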
Added 29 May 2010
Updated 29 May 2010
Type Conference
Year 2008
Where DSN
Authors Man-Lap Li, Pradeep Ramachandran, Swarup Kumar Sahoo, Sarita V. Adve, Vikram S. Adve, Yuanyuan Zhou