Sciweavers

» Distributed Fault Tolerant Controllers
IPPS
2007
IEEE
15 years 4 months ago
The Design and Implementation of Checkpoint/Restart Process Fault Tolerance for Open MPI
To be able to fully exploit ever larger computing platforms, modern HPC applications and system software must be able to tolerate inevitable faults. Historically, MPI implementati...
Joshua Hursey, Jeffrey M. Squyres, Timothy Mattox,...
IPPS
2000
IEEE
15 years 2 months ago
FANTOMAS: Fault Tolerance for Mobile Agents in Clusters
To achieve an efficient utilization of cluster systems, a proper programming and operating environment is required. In this context, mobile agents are of growing interes...
Holger Pals, Stefan Petri, Claus Grewe
DSN
2004
IEEE
15 years 1 months ago
Improving System Dependability with Functional Alternatives
We present the concept of alternative functionality for improving dependability in distributed embedded systems. Alternative functionality is a mechanism that complements traditio...
Charles P. Shelton, Philip Koopman
SEC
2003
14 years 11 months ago
Security, Fault-Tolerance and their Verification for Ambient Systems
For the emerging ambient environments, in which interconnected intelligent devices will surround us to increase the comfort of our lives, fault tolerance and security are of paramo...
Jaap-Henk Hoepman
CCGRID
2010
IEEE
14 years 11 months ago
Selective Recovery from Failures in a Task Parallel Programming Model
We present a fault tolerant task pool execution environment that is capable of performing fine-grain selective restart using a lightweight, distributed task completion tr...
James Dinan, Arjun Singri, P. Sadayappan, Sriram K...