Sciweavers

1186 search results - page 28 / 238
» The Communication in Intelligent Distributed Fault Tolerant ...
WDAG
2010
Springer
14 years 8 months ago
Implementing Fault-Tolerant Services Using State Machines: Beyond Replication
This paper describes a method to implement fault-tolerant services in distributed systems based on the idea of fused state machines. The theory of fused state machines us...
Vijay K. Garg
DSN
2003
IEEE
15 years 3 months ago
Comparison of Failure Detectors and Group Membership: Performance Study of Two Atomic Broadcast Algorithms
Protocols that solve agreement problems are essential building blocks for fault tolerant distributed systems. While many protocols have been published, little has been done to ana...
Péter Urbán, Ilya Shnayderman, André...
CCGRID
2006
IEEE
15 years 4 months ago
Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation
With the increasing number of processors in modern HPC (High Performance Computing) systems, there are two emergent problems to solve. One is scalability, the other is fault tolera...
Yuan Tang, Graham E. Fagg, Jack Dongarra
SRDS
1991
IEEE
15 years 1 month ago
A Fault-Tolerant, Scalable, Low-Overhead Distributed Garbage Detection Protocol
We present a protocol for the distributed detection of garbage in a distributed system subject to common failures such as lost and duplicated messages, network partition, dismount...
Marc Shapiro