Sciweavers

5 search results - page 1 / 1
EUROPAR
2007
Springer
13 years 10 months ago
Persistent Fault-Tolerance for Divide-and-Conquer Applications on the Grid
Grid applications need to be fault tolerant, malleable, and migratable. In previous work, we have presented orphan saving, an efficient mechanism addressing these issues for divide...
Gosia Wrzesinska, Ana-Maria Oprescu, Thilo Kielman...
ICA3PP
2010
Springer
13 years 4 months ago
Checkpointing and Migration of Communication Channels in Heterogeneous Grid Environments
A grid checkpointing service providing migration and transparent fault tolerance is important for distributed and parallel applications executed in heterogeneous grids. I...
John Mehnert-Spahn, Michael Schoettner
CCGRID
2008
IEEE
13 years 4 months ago
Fault Tolerance and Recovery of Scientific Workflows on Computational Grids
In this paper, we describe the design and implementation of two mechanisms for fault-tolerance and recovery for complex scientific workflows on computational grids. We present our ...
Gopi Kandaswamy, Anirban Mandal, Daniel A. Reed
HPCC
2010
Springer
13 years 4 months ago
A Generic Execution Management Framework for Scientific Applications
Managing the execution of scientific applications in a heterogeneous grid computing environment can be a daunting task, particularly for long running jobs. Increasing fault tolera...
Tanvire Elahi, Cameron Kiddle, Rob Simmonds