Sciweavers

24 search results - page 2 / 5
» Phoenix Project: Fault-Tolerant Applications
IJHPCA
2006
13 years 5 months ago
Fault-Tolerant Scheduling of Fine-Grained Tasks in Grid Environments
Divide-and-conquer is a well-suited programming paradigm for parallel Grid applications. Our Satin system efficiently schedules the fine-grained tasks of a divide-and-conquer appli...
Gosia Wrzesinska, Rob van Nieuwpoort, Jason Maasse...
HPCA
2007
IEEE
14 years 5 months ago
Evaluating MapReduce for Multi-core and Multiprocessor Systems
This paper evaluates the suitability of the MapReduce model for multi-core and multi-processor systems. MapReduce was created by Google for application development on data-centers...
Colby Ranger, Ramanan Raghuraman, Arun Penmetsa, G...
IPPS
2000
IEEE
13 years 9 months ago
Are COTS Suitable for Building Distributed Fault-Tolerant Hard Real-Time Systems?
For economic reasons, a new trend in the development of distributed hard real-time systems is to rely on the use of Commercial Off-The-Shelf (COTS) hardware and operating systems. As...
Pascal Chevochot, Antoine Colin, David Decotigny, ...
IPPS
2007
IEEE
13 years 11 months ago
The Design and Implementation of Checkpoint/Restart Process Fault Tolerance for Open MPI
To be able to fully exploit ever larger computing platforms, modern HPC applications and system software must be able to tolerate inevitable faults. Historically, MPI implementati...
Joshua Hursey, Jeffrey M. Squyres, Timothy Mattox,...
ICDCS
2012
IEEE
11 years 7 months ago
Combining Partial Redundancy and Checkpointing for HPC
Today’s largest High Performance Computing (HPC) systems exceed one Petaflops (10^15 floating point operations per second) and exascale systems are projected within seven years...
James Elliott, Kishor Kharbas, David Fiala, Frank ...