PVM
2010
Springer

Communication Target Selection for Replicated MPI Processes

Abstract. VolpexMPI is an MPI library designed for volunteer computing environments. In order to cope with the fundamental unreliability of these environments, VolpexMPI deploys two or more replicas of each MPI process. A receiver-driven communication scheme is employed to eliminate redundant message exchanges, and sender-based logging is employed to ensure seamless application progress with varying processor execution speeds and routine failures. In this model, to execute a receive operation, a decision has to be made as to which of the sending process replicas should be contacted first. Contacting the fastest replica appears to be the optimal local decision, but it can be globally non-optimal as it may slow down the fastest replica. Further, identifying the fastest replica during execution is a challenge in itself. This paper evaluates various target selection algorithms to manage these trade-offs with the objective of minimizing the overall execution time. The algorithms are evaluat...
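
To make the receiver-side decision concrete, here is a minimal Python sketch of one possible policy: contact the replica with the lowest observed response time and fall back to the next one on failure. It is not VolpexMPI's API and not one of the algorithms evaluated in the paper. `ReplicaStats`, `contact_order`, `receive`, the `request` callback, and the moving-average weight `alpha` are all hypothetical names and choices introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReplicaStats:
    """Hypothetical per-replica bookkeeping kept by a receiver."""
    rank: int
    avg_response: float = 0.0  # smoothed response time in seconds
    samples: int = 0
    alive: bool = True

    def record(self, elapsed: float, alpha: float = 0.5) -> None:
        # Exponential moving average; the first sample seeds the estimate.
        if self.samples == 0:
            self.avg_response = elapsed
        else:
            self.avg_response = alpha * elapsed + (1 - alpha) * self.avg_response
        self.samples += 1


def contact_order(replicas):
    """Order live replicas: unmeasured ones first (so each gets probed),
    then by increasing smoothed response time, with rank as tie-breaker."""
    live = [r for r in replicas if r.alive]
    return sorted(live, key=lambda r: (r.samples > 0, r.avg_response, r.rank))


def receive(replicas, request):
    """Ask sender replicas in order until one answers.

    `request(rank)` is an assumed callback that returns (payload, elapsed)
    or raises ConnectionError if that replica is unreachable.
    """
    for replica in contact_order(replicas):
        try:
            payload, elapsed = request(replica.rank)
        except ConnectionError:
            replica.alive = False  # skip this replica on later receives
            continue
        replica.record(elapsed)
        return payload
    raise RuntimeError("no live replica of the sender responded")
```

This greedy policy is the "locally optimal" choice the abstract cautions about: if every receiver converges on the same fast replica, the added request load can slow that replica down. A fuller treatment would have to weigh that effect against the cost of waiting on a slower replica.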
Rakhi Anand, Edgar Gabriel, Jaspal Subhlok
Added 30 Jan 2011
Updated 30 Jan 2011
Type Journal
Year 2010
Where PVM
Authors Rakhi Anand, Edgar Gabriel, Jaspal Subhlok