Sciweavers

PVM
2005
Springer

New User-Guided and ckpt-Based Checkpointing Libraries for Parallel MPI Applications

We present design and implementation details as well as performance results for two new parallel checkpointing libraries that we developed for parallel MPI applications. The first, a user-guided library, requires the programmer to supply packing and unpacking code through an easy-to-use API based on MPI constants. It performs checkpointing using either MPI-2 collective I/O calls or a dedicated master process. The second is a technically advanced parallel implementation of checkpointing based on the user-level ckpt library. It wraps the MPI calls in the user program, which makes it possible to run a shadow MPI application solely for communication purposes. The original processes and the shadow MPI code communicate through shared memory segments to which the communication buffers are mapped. We present checkpoint/restart times for the two approaches and their proposed variants, compared to an available LAMMPI/BLCR checkpointing solution for MPI applications. The performance of all the ver...
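The user-guided idea in the abstract can be illustrated with a minimal sketch. This is not the authors' API and does not use MPI: the names `pack_state`, `unpack_state`, `write_checkpoint`, and `read_checkpoint`, as well as the record layout, are hypothetical. It only shows the general pattern in which the programmer supplies packing and unpacking code and each process stores its packed state at its own offset in a shared checkpoint file, loosely analogous to an offset-based collective write.

```python
import os
import struct
import tempfile

# Hypothetical per-process state: an iteration counter (int32) and a residual (float64).
# "<" selects little-endian layout without padding, so each record is 12 bytes.
RECORD = struct.Struct("<id")


def pack_state(iteration, residual):
    """User-supplied packing code: turn application state into bytes."""
    return RECORD.pack(iteration, residual)


def unpack_state(buf):
    """User-supplied unpacking code: rebuild application state from bytes."""
    return RECORD.unpack(buf)


def write_checkpoint(path, rank, iteration, residual):
    """Store this rank's packed state at offset rank * RECORD.size."""
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as f:
        f.seek(rank * RECORD.size)
        f.write(pack_state(iteration, residual))


def read_checkpoint(path, rank):
    """Restore this rank's state from its slot in the checkpoint file."""
    with open(path, "rb") as f:
        f.seek(rank * RECORD.size)
        return unpack_state(f.read(RECORD.size))


if __name__ == "__main__":
    ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.bin")
    for rank in range(2):  # simulate two processes sequentially
        write_checkpoint(ckpt, rank, 7, 0.5 / (rank + 1))
    print(read_checkpoint(ckpt, 1))
```

In a real MPI program the per-rank file writes would be replaced by MPI-2 I/O calls or by sending the packed buffers to a master process, the two options the abstract mentions.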
Pawel Czarnul, Marcin Fraczak
Added 28 Jun 2010
Updated 28 Jun 2010
Type Conference
Year 2005
Where PVM
Authors Pawel Czarnul, Marcin Fraczak