Sciweavers

BASE: Using Abstraction to Improve Fault Tolerance
IPPS
2007
IEEE
15 years 4 months ago
A Framework for Experimental Validation and Performance Evaluation in Fault Tolerant Distributed System
Performing experimental evaluation of fault tolerant distributed systems is a complex and tedious task, and automating as much as possible of the execution and evaluation of exper...
Hein Meling
WADS
2009
Springer
15 years 4 months ago
Fault Tolerant External Memory Algorithms
Algorithms dealing with massive data sets are usually designed for I/O-efficiency, often captured by the I/O model by Aggarwal and Vitter. Another aspect of dealing with ...
Gerth Stølting Brodal, Allan Grønlun...
CCGRID
2010
IEEE
14 years 10 months ago
Selective Recovery from Failures in a Task Parallel Programming Model
We present a fault tolerant task pool execution environment that is capable of performing fine-grain selective restart using a lightweight, distributed task completion tr...
James Dinan, Arjun Singri, P. Sadayappan, Sriram K...
CLUSTER
2003
IEEE
15 years 3 months ago
Coordinated Checkpoint versus Message Log for Fault Tolerant MPI
Large clusters, high availability clusters and Grid deployments often suffer from network, node or operating system faults and thus require the use of fault tolerant programmin...
Aurelien Bouteiller, Pierre Lemarinier, Gér...
HICSS
2006
IEEE
15 years 3 months ago
Exploiting Mobile Agents for Structured Distributed Software-Implemented Fault Injection
Embedded distributed real-time systems are traditionally used in safety-critical application areas such as avionics, healthcare, and the automotive sector. Assuring dependability ...
Thomas M. Galla, Karin Anna Hummel, Burkhard Peer