Sciweavers

Share
HIPC
2007
Springer
10 years 6 months ago
A Scalable Asynchronous Replication-Based Strategy for Fault Tolerant MPI Applications
As computational clusters increase in size, their mean-time-to-failure reduces. Typically checkpointing is used to minimize the loss of computation. Most checkpointing techniques, ...
John Paul Walters, Vipin Chaudhary
books