Sciweavers

IPPS
2009
IEEE
14 years 1 month ago
DMTCP: Transparent checkpointing for cluster computations and the desktop
DMTCP (Distributed MultiThreaded CheckPointing) is a transparent user-level checkpointing package for distributed applications. Checkpointing and restart is demonstrated for a wid...
Jason Ansel, Kapil Arya, Gene Cooperman
SC
2005
ACM
13 years 12 months ago
Transparent, Incremental Checkpointing at Kernel Level: a Foundation for Fault Tolerance for Parallel Computers
We describe the software architecture, technical features, and performance of TICK (Transparent Incremental Checkpointer at Kernel level), a system-level checkpointer implemented ...
Roberto Gioiosa, José Carlos Sancho, Song J...
CCGRID
2006
IEEE
14 years 15 days ago
Transparent Adaptive Library-Based Checkpointing for Master-Worker Style Parallelism
We present a transparent, system-level checkpointing solution for master-worker parallelism that automatically adapts, upon restart, to the number of processor nodes available. Th...
Gene Cooperman, Jason Ansel, Xiaoqin Ma
CLUSTER
2005
IEEE
14 years 1 day ago
Transparent Checkpoint-Restart of Distributed Applications on Commodity Clusters
We have created ZapC, a novel system for transparent coordinated checkpoint-restart of distributed network applications on commodity clusters. ZapC provides a thin virtualization ...
Oren Laadan, Dan B. Phung, Jason Nieh
CLUSTER
2005
IEEE
14 years 1 day ago
Minimizing the Network Overhead of Checkpointing in Cycle-harvesting Cluster Environments
Cycle-harvesting systems such as Condor have been developed to make desktop machines in a local area (which are often similar to clusters in hardware configuration) available as ...
Daniel Nurmi, John Brevik, Richard Wolski