Sciweavers

» Transparent checkpoints of closed distributed systems in Emu...
CLUSTER
2005
IEEE
13 years 11 months ago
Transparent Checkpoint-Restart of Distributed Applications on Commodity Clusters
We have created ZapC, a novel system for transparent coordinated checkpoint-restart of distributed network applications on commodity clusters. ZapC provides a thin virtualization ...
Oren Laadan, Dan B. Phung, Jason Nieh
EGC
2005
Springer
13 years 11 months ago
Transparent Fault Tolerance for Grid Applications
A major challenge facing grid applications is the appropriate handling of failures. In this paper we address the problem of making parallel Java applications based on Remote Method...
Pawel Garbacki, Bartosz Biskupski, Henri E. Bal
PODC
1994
ACM
13 years 10 months ago
A Checkpoint Protocol for an Entry Consistent Shared Memory System
Workstation clusters are becoming an interesting alternative to dedicated multiprocessors. In this environment, the probability of a failure, during an application's executio...
Nuno Neves, Miguel Castro, Paulo Guedes
IPPS
2005
IEEE
13 years 11 months ago
Current Practice and a Direction Forward in Checkpoint/Restart Implementations for Fault Tolerance
Checkpoint/restart is a general idea for which particular implementations enable various functionalities in computer systems, including process migration, gang scheduling, hiberna...
José Carlos Sancho, Fabrizio Petrini, Kei D...
CLUSTER
2003
IEEE
13 years 11 months ago
Coordinated Checkpoint versus Message Log for Fault Tolerant MPI
Large Clusters, high availability clusters and Grid deployments often suffer from network, node or operating system faults and thus require the use of fault tolerant programmin...
Aurelien Bouteiller, Pierre Lemarinier, Gér...