Sciweavers

» Fault-tolerant solutions for a MPI compute intensive applica...
HIPC
2007
Springer
13 years 11 months ago
A Scalable Asynchronous Replication-Based Strategy for Fault Tolerant MPI Applications
As computational clusters increase in size, their mean-time-to-failure reduces. Typically checkpointing is used to minimize the loss of computation. Most checkpointing techniques, ...
John Paul Walters, Vipin Chaudhary
GRID
2004
Springer
13 years 10 months ago
Phoenix: Making Data-Intensive Grid Applications Fault-Tolerant
A major hurdle facing data intensive grid applications is the appropriate handling of failures that occur in the grid-environment. Implementing the fault-tolerance transparently a...
George Kola, Tevfik Kosar, Miron Livny
IPPS
2005
IEEE
13 years 10 months ago
Combining FT-MPI with H2O: Fault-Tolerant MPI Across Administrative Boundaries
We observe increasing interest in aggregating geographically distributed, heterogeneous resources to perform large scale computations. MPI remains the most popular programming par...
Dawid Kurzyniec, Vaidy S. Sunderam
CLUSTER
2006
IEEE
13 years 11 months ago
FAIL-MPI: How Fault-Tolerant Is Fault-Tolerant MPI?
One of the topics of paramount importance in the development of Cluster and Grid middleware is the impact of faults since their occurrence in Grid infrastructures and in large-sca...
William Hoarau, Pierre Lemarinier, Thomas Hé...
PVM
2005
Springer
13 years 10 months ago
Scalable Fault Tolerant MPI: Extending the Recovery Algorithm
Fault Tolerant MPI (FT-MPI)[6] was designed as a solution to allow applications different methods to handle process failures beyond simple check-point restart schemes. The init...
Graham E. Fagg, Thara Angskun, George Bosilca, Jel...