Replication is a key strategy for improving locality, fault tolerance and availability in distributed systems. The paper focuses on distributed file systems and presents a system ...
Replication is a key technique for improving fault tolerance. Replication can also improve application performance under some circumstances, but can have the opposite effect under...
ing Abstraction to Improve Fault Tolerance MIGUEL CASTRO Microsoft Research and RODRIGO RODRIGUES and BARBARA LISKOV MIT Laboratory for Computer Science Software errors are a major...
This paper presents a component model for building distributed applications with fault-tolerance requirements. The AFT-CCM model selects the configuration of replicated services d...
We consider storage in an extremely large-scale distributed computer system designed for stream processing applications. In such systems, incoming data and intermediate results ma...
Kirsten Hildrum, Fred Douglis, Joel L. Wolf, Phili...