Sciweavers

131 search results - page 2 / 27
» Fault-Tolerant Replication Management in Large-Scale Distrib...
EUMAS
2006
13 years 6 months ago
DimaX: A Fault-Tolerant Multi-Agent Platform
Fault tolerance is an important property of large-scale multiagent systems as the failure rate grows with both the number of the hosts and deployed agents, and the duration of com...
Nora Faci, Zahia Guessoum, Olivier Marin
HIPC
2007
Springer
13 years 11 months ago
A Scalable Asynchronous Replication-Based Strategy for Fault Tolerant MPI Applications
As computational clusters increase in size, their mean-time-to-failure reduces. Typically checkpointing is used to minimize the loss of computation. Most checkpointing techniques, ...
John Paul Walters, Vipin Chaudhary
IEEESCC
2007
IEEE
13 years 11 months ago
A Fault-Tolerant Middleware Architecture for High-Availability Storage Services
Today organizations and business enterprises of all sizes need to deal with unprecedented amounts of digital information, creating challenging demands for mass storage and on-dema...
Sangeetha Seshadri, Ling Liu, Brian F. Cooper, Law...
WORDS
2003
IEEE
13 years 10 months ago
Decentralized Resource Management and Fault-Tolerance for Distributed CORBA Applications
Assigning an application’s fault-tolerance properties (e.g., replication style, checkpointing frequency) statically, and in an arbitrary manner, can lead to the application not ...
Carlos F. Reverte, Priya Narasimhan
WDAG
2010
Springer
13 years 3 months ago
Implementing Fault-Tolerant Services Using State Machines: Beyond Replication
This paper describes a method to implement fault-tolerant services in distributed systems based on the idea of fused state machines. The theory of fused state machines us...
Vijay K. Garg