Sciweavers

VEE
2012
ACM

SecondSite: disaster tolerance as a service

This paper describes the design and implementation of SecondSite, a cloud-based service for disaster tolerance. SecondSite extends the Remus virtualization-based high availability system by allowing groups of virtual machines to be replicated across data centers over wide-area Internet links. The goal of the system is to commodify the property of availability, exposing it as a simple tick box when configuring a new virtual machine. To achieve this in the wide area, we have had to tackle the related issues of replication traffic bandwidth, reliable failure detection across geographic regions and traffic redirection over a wide-area network without compromising on transparency and consistency.
Categories and Subject Descriptors D.4.5 [Operating Systems]: Reliability—Backup procedures, Checkpoint/restart, Fault-tolerance
Keywords Wide Area Replication, Disaster Recovery
Added 25 Apr 2012
Updated 25 Apr 2012
Type Journal
Year 2012
Where VEE
Authors Shriram Rajagopalan, Brendan Cully, Ryan O'Connor, Andrew Warfield