Search Sciweavers | Sciweavers

115 search results - page 3 / 23

» Transparent Fault Tolerance for Parallel Applications on Net...

IPPS
2007
IEEE

102 views, Distributed And Parallel Com...

DejaVu: Transparent User-Level Checkpointing, Migration, and Recovery for Distributed Systems

13 years 11 months ago

Download www.cecs.uci.edu

In this paper, we present a new fault tolerance system called DejaVu for transparent and automatic checkpointing, migration, and recovery of parallel and distributed applications....

Joseph F. Ruscio, Michael A. Heffner, Srinidhi Var...

CCGRID
2008
IEEE

132 views, Distributed And Parallel Com...

Fault Tolerance and Recovery of Scientific Workflows on Computational Grids

13 years 5 months ago

Download xcr.cenit.latech.edu

In this paper, we describe the design and implementation of two mechanisms for fault-tolerance and recovery for complex scientific workflows on computational grids. We present our ...

Gopi Kandaswamy, Anirban Mandal, Daniel A. Reed

IPPS
2000
IEEE

128 views, Distributed And Parallel Com...

Fault Tolerant Wide-Area Parallel Computing

13 years 8 months ago

Download ipdps.cc.gatech.edu

Executing parallel applications across distributed networks introduces the problem of fault tolerance. A viable solution for fault tolerance must keep overhead manageable and not c...

Jon B. Weissman

IPPS
2003
IEEE

107 views, Distributed And Parallel Com...

A Low Cost Fault Tolerant Packet Routing for Parallel Computers

13 years 10 months ago

Download www.atc.unican.es

This work presents a new switching mechanism to tolerate arbitrary faults in interconnection networks with a negligible implementation cost. Although our routing technique can be ...

Valentin Puente, José A. Gregorio, Ramó...

CLUSTER
2003
IEEE

165 views, Distributed And Parallel Com...

Coordinated Checkpoint versus Message Log for Fault Tolerant MPI

13 years 10 months ago

Download www.cs.utk.edu

Large clusters, high availability clusters and Grid deployments often suffer from network, node or operating system faults and thus require the use of fault tolerant programmin...

Aurelien Bouteiller, Pierre Lemarinier, Gér...
