Sciweavers

HPCA
2008
IEEE

Supporting highly-decoupled thread-level redundancy for parallel programs

14 years 5 months ago
Supporting highly-decoupled thread-level redundancy for parallel programs
The continued scaling of device dimensions and the operating voltage reduces the critical charge and thus natural noise tolerance level of transistors. As a result, circuits can produce transient upsets that corrupt program execution and data. Redundant execution can detect and correct circuit errors on the fly. The increasing prevalence of multi-core architectures makes coarse-grain threadlevel redundancy (TLR) very attractive. While TLR has been extensively studied in the context of single-threaded applications, much less attention is paid to the design issues and tradeoffs of supporting parallel codes. In this paper, we propose a microarchitecture to efficiently support TLR for parallel codes. One of the main design goals is to support a large number of unverified instructions, so that long latencies in verification can be easily tolerated. Another important objective is to have a comprehensive coverage that includes not only the computation logic but also the coherence and consist...
M. Wasiur Rashid, Michael C. Huang
Added 01 Dec 2009
Updated 01 Dec 2009
Type Conference
Year 2008
Where HPCA
Authors M. Wasiur Rashid, Michael C. Huang
Comments (0)