Sciweavers

18 search results - page 2 / 4
» DMTCP: Transparent checkpointing for cluster computations an...
PODC
1994
ACM
13 years 9 months ago
A Checkpoint Protocol for an Entry Consistent Shared Memory System
Workstation clusters are becoming an interesting alternative to dedicated multiprocessors. In this environment, the probability of a failure, during an application's executio...
Nuno Neves, Miguel Castro, Paulo Guedes
CLUSTER
2003
IEEE
13 years 10 months ago
Coordinated Checkpoint versus Message Log for Fault Tolerant MPI
Large clusters, high availability clusters and Grid deployments often suffer from network, node or operating system faults and thus require the use of fault tolerant programmin...
Aurelien Bouteiller, Pierre Lemarinier, Gér...
CLUSTER
2004
IEEE
13 years 9 months ago
Improved message logging versus improved coordinated checkpointing for fault tolerant MPI
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H...
OSDI
2002
ACM
14 years 5 months ago
The Design and Implementation of Zap: A System for Migrating Computing Environments
We have created Zap, a novel system for transparent migration of legacy and networked applications. Zap provides a thin virtualization layer on top of the operating system that in...
Steven Osman, Dinesh Subhraveti, Gong Su, Jason Ni...
SOSP
2007
ACM
14 years 2 months ago
DejaView: a personal virtual computer recorder
As users interact with the world and their peers through their computers, it is becoming important to archive and later search the information that they have viewed. We present De...
Oren Laadan, Ricardo A. Baratto, Dan B. Phung, Sha...