Sciweavers

Optimal recovery schemes in fault tolerant distributed compu...
IPPS
2007
IEEE
13 years 11 months ago
Self Adaptive Application Level Fault Tolerance for Parallel and Distributed Computing
Most application level fault tolerance schemes in literature are non-adaptive in the sense that the fault tolerance schemes incorporated in applications are usually designed witho...
Zizhong Chen, Ming Yang, Guillermo A. Francia III,...
ICDCS
1999
IEEE
13 years 9 months ago
NAP: Practical Fault-Tolerance for Itinerant Computations
NAP, a detection and recovery based scheme for implementing fault-tolerant itinerant computations, is presented. We give the semantics for the scheme and describe a protocol that ...
Dag Johansen, Keith Marzullo, Fred B. Schneider, K...
IPPS
1998
IEEE
13 years 9 months ago
A Generalized Forward Recovery Checkpointing Scheme
We propose a generalized forward recovery checkpointing scheme, with lookahead execution and rollback validation. This method takes advantage of voting and comparison on multiple v...
Ke Huang, Jie Wu, Eduardo B. Fernández
IPPS
2007
IEEE
13 years 11 months ago
Fault-Tolerant Earliest-Deadline-First Scheduling Algorithm
The general approach to fault tolerance in uniprocessor systems is to maintain enough time redundancy in the schedule so that any task instance can be re-executed in presence of f...
Hakem Beitollahi, Seyed Ghassem Miremadi, Geert De...
ICPP
1987
IEEE
13 years 8 months ago
A Software-Based Hardware Fault Tolerance Scheme for Multicomputers
A hardware fault tolerance scheme for large multicomputers executing time-consuming non-interactive applications is described. Error detection and recovery are done mostly by so...
Yuval Tamir, Eli Gafni