Sciweavers

Fault Tolerance in Message Passing and in Action
CLUSTER
2004
IEEE
13 years 4 months ago
MPI/FT: A Model-Based Approach to Low-Overhead Fault Tolerant Message-Passing Middleware
Fault tolerance in parallel systems has traditionally been achieved through a combination of redundancy and checkpointing methods. This notion has also been extended to message-pas...
Rajanikanth Batchu, Yoginder S. Dandass, Anthony S...
CLUSTER
2006
IEEE
13 years 11 months ago
FAIL-MPI: How Fault-Tolerant Is Fault-Tolerant MPI?
One of the topics of paramount importance in the development of Cluster and Grid middleware is the impact of faults since their occurrence in Grid infrastructures and in large-sca...
William Hoarau, Pierre Lemarinier, Thomas Hé...
IJHPCA
2006
13 years 5 months ago
MPICH-V Project: A Multiprotocol Automatic Fault-Tolerant MPI
High performance computing platforms like Clusters, Grid and Desktop Grids are becoming larger and subject to more frequent failures. MPI is one of the most used message...
Aurelien Bouteiller, Thomas Hérault, Gé...
CLUSTER
2002
IEEE
13 years 10 months ago
Design and Validation of Portable Communication Infrastructure for Fault-Tolerant Cluster Middleware
We describe the communication infrastructure (CI) for our fault-tolerant cluster middleware, which is optimized for two classes of communication: for the applications and for the ...
Ming Li, Wenchao Tao, Daniel Goldberg, Israel Hsu,...