Sciweavers

IJHPCA
2006

Fault-Tolerant Scheduling of Fine-Grained Tasks in Grid Environments

13 years 4 months ago
Fault-Tolerant Scheduling of Fine-Grained Tasks in Grid Environments
Divide-and-conquer is a well-suited programming paradigm for parallel Grid applications. Our Satin system efficiently schedules the finegrained tasks of a divide-and-conquer application across multiple clusters in a grid. To accomodate long-running applications, we present a fault-tolerance mechanism for Satin that has negligible overhead during normal execution, while minimizing the amount of redundant work done after a crash of one or more nodes. We study the impact of our fault-tolerance mechanism on application efficiency, both on the Dutch DAS-2 system and using the European testbed of the EC-funded project GridLab.
Gosia Wrzesinska, Rob van Nieuwpoort, Jason Maasse
Added 12 Dec 2010
Updated 12 Dec 2010
Type Journal
Year 2006
Where IJHPCA
Authors Gosia Wrzesinska, Rob van Nieuwpoort, Jason Maassen, Thilo Kielmann, Henri E. Bal
Comments (0)