Sciweavers

14 search results - page 1 / 3
CLUSTER
2005
IEEE
13 years 10 months ago
Minimizing the Network Overhead of Checkpointing in Cycle-harvesting Cluster Environments
Cycle-harvesting systems such as Condor have been developed to make desktop machines in a local area (which are often similar to clusters in hardware configuration) available as ...
Daniel Nurmi, John Brevik, Richard Wolski
HIPC
2007
Springer
13 years 11 months ago
A Scalable Asynchronous Replication-Based Strategy for Fault Tolerant MPI Applications
As computational clusters increase in size, their mean-time-to-failure reduces. Typically checkpointing is used to minimize the loss of computation. Most checkpointing techniques, ...
John Paul Walters, Vipin Chaudhary
CLUSTER
2003
IEEE
13 years 10 months ago
Coordinated Checkpoint versus Message Log for Fault Tolerant MPI
Large Clusters, high availability clusters and Grid deployments often suffer from network, node or operating system faults and thus require the use of fault tolerant programmin...
Aurelien Bouteiller, Pierre Lemarinier, Gér...
ICDCS
2006
IEEE
13 years 11 months ago
Analysis of Clustering and Routing Overhead for Clustered Mobile Ad Hoc Networks
This paper presents an analysis of the control overhead involved in clustering and routing for one-hop clustered mobile ad hoc networks. Previous work on the analysis of control o...
Mingqiang Xue, Inn Inn Er, Winston Khoon Guan Seah
ICS
1999
Tsinghua U.
13 years 9 months ago
Fast cluster failover using virtual memory-mapped communication
This paper proposes a novel way to use virtual memory-mapped communication (VMMC) to reduce the failover time on clusters. With the VMMC model, applications’ virtual address spac...
Yuanyuan Zhou, Peter M. Chen, Kai Li