Sciweavers

695 search results - page 4 / 139
Cache based fault recovery for distributed systems
IPPS
2010
IEEE
13 years 4 months ago
Supporting fault tolerance in a data-intensive computing middleware
Over the last 2-3 years, the importance of data-intensive computing has increasingly been recognized, closely coupled with the emergence and popularity of map-reduce for developin...
Tekin Bicer, Wei Jiang, Gagan Agrawal
CLUSTER
2004
IEEE
13 years 6 months ago
MPI/FT: A Model-Based Approach to Low-Overhead Fault Tolerant Message-Passing Middleware
Fault tolerance in parallel systems has traditionally been achieved through a combination of redundancy and checkpointing methods. This notion has also been extended to message-pas...
Rajanikanth Batchu, Yoginder S. Dandass, Anthony S...
SAC
2006
ACM
14 years 11 days ago
Proactive resilience through architectural hybridization
In a recent work, we have shown that it is not possible to dependably build any type of distributed f fault or intrusion-tolerant system under the asynchronous model. This result f...
Paulo Sousa, Nuno Ferreira Neves, Paulo Verí...
WDAG
2010
Springer
13 years 4 months ago
Implementing Fault-Tolerant Services Using State Machines: Beyond Replication
This paper describes a method to implement fault-tolerant services in distributed systems based on the idea of fused state machines. The theory of fused state machines us...
Vijay K. Garg
CORR
2008
Springer
13 years 6 months ago
Proactive Service Migration for Long-Running Byzantine Fault Tolerant Systems
In this paper, we describe a proactive recovery scheme based on service migration for long-running Byzantine fault tolerant systems. Proactive recovery is an essential method for ...
Wenbing Zhao