Sciweavers

2 search results - page 1 / 1
ICPP
2007
IEEE
15 years 4 months ago
Group-based Coordinated Checkpointing for MPI: A Case Study on InfiniBand
Qi Gao, Wei Huang, Matthew J. Koop, Dhabaleswar K....
CLUSTER
2004
IEEE
15 years 1 month ago
Improved message logging versus improved coordinated checkpointing for fault tolerant MPI
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H&e...