Sciweavers

IJHPCA
2006
13 years 6 months ago
MPICH-V Project: A Multiprotocol Automatic Fault-Tolerant MPI
Abstract. High performance computing platforms like Clusters, Grid and Desktop Grids are becoming larger and subject to more frequent failures. MPI is one of the most used message...
Aurelien Bouteiller, Thomas Hérault, Gé...
IJHPCA
2006
13 years 6 months ago
Recent Developments in GridSolve
The purpose of GridSolve is to create the middleware necessary to provide a seamless bridge between the simple, standard programming interfaces and desktop systems that dominate t...
Asim YarKhan, Keith Seymour, Kiran Sagi, Zhiao Shi...
IPPS
2000
IEEE
13 years 10 months ago
FANTOMAS: Fault Tolerance for Mobile Agents in Clusters
Abstract. To achieve an efficient utilization of cluster systems, a proper programming and operating environment is required. In this context, mobile agents are of growing interes...
Holger Pals, Stefan Petri, Claus Grewe
SC
2004
ACM
13 years 12 months ago
Kosha: A Peer-to-Peer Enhancement for the Network File System
This paper presents Kosha, a peer-to-peer (p2p) enhancement for the widely-used Network File System (NFS). Kosha harvests redundant storage space on cluster nodes and user desktop...
Ali Raza Butt, Troy A. Johnson, Yili Zheng, Y. Cha...
HPDC
2009
IEEE
14 years 1 month ago
Interconnect agnostic checkpoint/restart in Open MPI
Long-running High Performance Computing (HPC) applications at scale must be able to tolerate inevitable faults if they are to harness current and future HPC systems. Message Passi...
Joshua Hursey, Timothy Mattox, Andrew Lumsdaine