Sciweavers

PVM
2009
Springer

VolpexMPI: An MPI Library for Execution of Parallel Applications on Volatile Nodes

The objective of this research is to convert ordinary idle PCs into virtual clusters for executing parallel applications. The paper introduces VolpexMPI, which is designed to enable seamless forward application progress in the presence of frequent node failures as well as dynamically changing network speeds and node execution speeds. Process replication is employed to provide robustness in such volatile environments. The central challenge in the VolpexMPI design is to efficiently and automatically manage a dynamically varying number of process replicas in different states of execution progress. The key fault tolerance technique employed is fully distributed sender-based logging. The paper presents the design and a prototype implementation of VolpexMPI. Preliminary results validate that the overhead of providing robustness is modest for applications having a favorable ratio of communication to computation and a low degree of communication.
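The idea of sender-based logging with replicas at different stages of progress can be illustrated with a minimal Python sketch. This is not VolpexMPI's implementation or API: the class names `SenderLog` and `ReceiverReplica`, the pull-style `fetch` call, and the per-(destination, tag) sequence numbering are all assumptions made for illustration. The sketch only shows why keeping sent messages at the sender lets a slow or restarted replica obtain the same data a faster replica has already consumed.

```python
from collections import defaultdict


class SenderLog:
    """Hypothetical per-process log of outgoing messages (illustration only)."""

    def __init__(self):
        # (destination rank, tag) -> payloads, indexed by sequence number
        self._log = defaultdict(list)

    def send(self, dest, tag, payload):
        """Record the payload locally and return its sequence number."""
        entries = self._log[(dest, tag)]
        entries.append(payload)
        return len(entries) - 1

    def fetch(self, dest, tag, seq):
        """Serve a request from any replica of rank `dest`.

        Returns None when message `seq` has not been produced yet, so a
        replica that runs ahead of the sender has to retry later.
        Payloads are assumed to be non-None in this sketch.
        """
        entries = self._log[(dest, tag)]
        return entries[seq] if seq < len(entries) else None


class ReceiverReplica:
    """Hypothetical replica of a receiving rank that tracks its own progress."""

    def __init__(self, rank):
        self.rank = rank
        self._next_seq = defaultdict(int)  # tag -> next sequence number to pull

    def recv(self, sender_log, tag):
        """Pull the next unread message for `tag`, or None if unavailable."""
        seq = self._next_seq[tag]
        payload = sender_log.fetch(self.rank, tag, seq)
        if payload is not None:
            self._next_seq[tag] = seq + 1
        return payload
```

In this sketch, two `ReceiverReplica` objects created with the same rank can call `recv` against the same `SenderLog` at different times and each receives the full message sequence in order, independent of how far the other replica has progressed.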
Added 27 May 2010
Updated 27 May 2010
Type Conference
Year 2009
Where PVM
Authors Troy LeBlanc, Rakhi Anand, Edgar Gabriel, Jaspal Subhlok