Sciweavers

IPPS
2009
IEEE
13 years 11 months ago
DMTCP: Transparent checkpointing for cluster computations and the desktop
DMTCP (Distributed MultiThreaded CheckPointing) is a transparent user-level checkpointing package for distributed applications. Checkpointing and restart is demonstrated for a wid...
Jason Ansel, Kapil Arya, Gene Cooperman
SC
2005
ACM
13 years 10 months ago
Transparent, Incremental Checkpointing at Kernel Level: a Foundation for Fault Tolerance for Parallel Computers
We describe the software architecture, technical features, and performance of TICK (Transparent Incremental Checkpointer at Kernel level), a system-level checkpointer implemented ...
Roberto Gioiosa, José Carlos Sancho, Song J...
CCGRID
2006
IEEE
13 years 11 months ago
Transparent Adaptive Library-Based Checkpointing for Master-Worker Style Parallelism
We present a transparent, system-level checkpointing solution for master-worker parallelism that automatically adapts, upon restart, to the number of processor nodes available. Th...
Gene Cooperman, Jason Ansel, Xiaoqin Ma
CLUSTER
2005
IEEE
13 years 10 months ago
Transparent Checkpoint-Restart of Distributed Applications on Commodity Clusters
We have created ZapC, a novel system for transparent coordinated checkpoint-restart of distributed network applications on commodity clusters. ZapC provides a thin virtualization ...
Oren Laadan, Dan B. Phung, Jason Nieh
CLUSTER
2005
IEEE
13 years 10 months ago
Minimizing the Network Overhead of Checkpointing in Cycle-harvesting Cluster Environments
Cycle-harvesting systems such as Condor have been developed to make desktop machines in a local area (which are often similar to clusters in hardware configuration) available as ...
Daniel Nurmi, John Brevik, Richard Wolski