Sciweavers

695 search results - page 11 / 139
» Cache based fault recovery for distributed systems
Sort
View

Publication
165views
13 years 3 months ago
Task scheduling algorithm for multicore processor system for minimizing recovery time in case of single node fault
In this paper, we propose a task scheduling algorithm for a multicore processor system which reduces the recovery time in case of a single fail-stop failure of a multicore processo...
Shohei Gotoda, Naoki Shibata and Minoru Ito

Presentation
324views
13 years 3 months ago
Task scheduling algorithm for multicore processor system for minimizing recovery time in case of single node fault
In this paper, we propose a task scheduling al-gorithm for a multicore processor system which reduces the recovery time in case of a single fail-stop failure of a multicore process...
ASPLOS
2009
ACM
15 years 10 months ago
ASSURE: automatic software self-healing using rescue points
Software failures in server applications are a significant problem for preserving system availability. We present ASSURE, a system that introduces rescue points that recover softw...
Stelios Sidiroglou, Oren Laadan, Carlos Perez, Nic...
PVM
2005
Springer
15 years 3 months ago
Scalable Fault Tolerant MPI: Extending the Recovery Algorithm
ct Fault Tolerant MPI (FT-MPI)[6] was designed as a solution to allow applications different methods to handle process failures beyond simple check-point restart schemes. The init...
Graham E. Fagg, Thara Angskun, George Bosilca, Jel...
ICSE
2003
IEEE-ACM
15 years 2 months ago
Supporting Dependable Distributed Applications Through a Component-Oriented Middleware-Based Group Service
Abstract. Dependable distributed applications require flexible infrastructure support for controlled redundancy, replication, and recovery of components and services. However, mos...
Katia B. Saikoski, Geoff Coulson