CLUSTER
2004
IEEE

MPI/FT: A Model-Based Approach to Low-Overhead Fault Tolerant Message-Passing Middleware

Fault tolerance in parallel systems has traditionally been achieved through a combination of redundancy and checkpointing methods. This notion has also been extended to message-passing systems with user-transparent process checkpointing and message logging. Furthermore, studies of multiple types of rollback and recovery have been reported in the literature, ranging from communication-induced checkpointing to pessimistic and synchronous solutions. However, many of these solutions incur high overhead because they cannot exploit application-level information.
Added 16 Dec 2010
Updated 16 Dec 2010
Type Journal
Year 2004
Where CLUSTER
Authors Rajanikanth Batchu, Yoginder S. Dandass, Anthony Skjellum, Murali Beddhu