Sciweavers

TPDS
2008

Extending the TokenCMP Cache Coherence Protocol for Low Overhead Fault Tolerance in CMP Architectures

13 years 4 months ago
It is widely accepted that transient failures will appear more frequently in chips designed in the near future, due to factors such as increased integration scale. At the same time, chip multiprocessors (CMPs), which integrate several processor cores on a single chip, are currently the best way to make efficient use of the growing number of transistors that can be placed on a single die. Hence, new techniques for dealing with these faults are needed in order to build sufficiently reliable CMPs. In this work, we present a cache coherence protocol aimed at tolerating transient failures that affect the interconnection network of a CMP; that is, it no longer assumes that the network is reliable. In particular, our proposal extends a token-based cache coherence protocol so that no data can be lost and no deadlock can occur due to any dropped message. Using the GEMS full-system simulator, we compare our proposal against a similar protocol without fault tolerance (TOKENC...
Added 29 Dec 2010
Updated 29 Dec 2010
Type Journal
Year 2008
Where TPDS
Authors Ricardo Fernández Pascual, José M. García, Manuel E. Acacio, José Duato