Sciweavers

TPDS
2010

Dealing with Transient Faults in the Interconnection Network of CMPs at the Cache Coherence Level

Transient faults are predicted to become more important as current technology trends increase the scale of integration. One component that will be significantly affected is the interconnection network of chip multiprocessors (CMPs). To deal with these faults efficiently, and unlike other authors, we propose fault-tolerant cache coherence protocols that ensure correct program execution even when not all messages are correctly delivered. We describe the extensions made to a directory-based cache coherence protocol to provide fault tolerance, and we present a modified set of token counting rules that are useful for designing fault-tolerant token-based cache coherence protocols. We compare the directory-based fault-tolerant protocol with a token-based fault-tolerant one. We also show how to adjust the fault tolerance parameters to achieve the desired level of fault tolerance and measure the overhead achieved to be able to support very high fault r...
Added 22 May 2011
Updated 22 May 2011
Type Journal
Year 2010
Where TPDS
Authors Ricardo Fernández Pascual, José M. García, Manuel E. Acacio, José Duato