Sciweavers

AC
1999
Springer

Enhancing Replica Management Services to Cope with Group Failures

In a distributed system, replication of components, such as objects, is a well-known way of achieving availability. For increased availability, crashed and disconnected components must be replaced by new components on available spare nodes. This replacement results in the membership of the replicated group 'walking' over a number of machines during system operation. In this context, we address the problem of reconfiguring a group after the group as an entity has failed. Such a failure is termed a group failure, which can be, for example, the crash of every component in the group or the partitioning of the group into minority islands. The solution assumes crash-proof storage, eventual recovery of crashed nodes, and eventual healing of partitions. It guarantees that (i) the number of groups reconfigured after a group failure is never more than one, and (ii) the reconfigured group contains a majority of the components which were members of the group just before the group failure occurre...
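The two guarantees in the abstract rest on a majority condition, which can be illustrated with a minimal sketch. This is not the authors' protocol, which the truncated abstract does not describe; the function names `is_majority` and `can_reconfigure`, the node identifiers, and the idea of reading the last membership view from crash-proof storage are assumptions made for illustration only. The sketch shows why requiring a strict majority of the last view allows at most one reconfigured group: two disjoint sets of recovered nodes cannot both hold a strict majority of the same view.

```python
from typing import FrozenSet, Iterable


def is_majority(subset: Iterable[str], last_view: FrozenSet[str]) -> bool:
    """Return True if `subset` holds a strict majority of `last_view`.

    Nodes in `subset` that were not members of `last_view` are ignored.
    """
    members = frozenset(subset) & last_view
    return 2 * len(members) > len(last_view)


def can_reconfigure(recovered: Iterable[str], last_view: FrozenSet[str]) -> bool:
    """Hypothetical check: may the recovered nodes form the new group?

    `last_view` stands for the membership just before the group failure,
    assumed here to be readable from crash-proof storage after recovery.
    """
    if not last_view:
        raise ValueError("last_view must not be empty")
    return is_majority(recovered, last_view)


if __name__ == "__main__":
    last_view = frozenset({"n1", "n2", "n3", "n4", "n5"})
    # Three of five former members have recovered and can see each other.
    print(can_reconfigure({"n1", "n2", "n3"}, last_view))
    # A minority island must wait for more nodes to recover or partitions to heal.
    print(can_reconfigure({"n4", "n5"}, last_view))
```

Because any two strict majorities of the same view share at least one member, a node that commits to one reconfiguration can refuse to join another, which is the usual route to property (i); property (ii) follows directly from the majority test.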
Paul D. Ezhilchelvan, Santosh K. Shrivastava
Added 03 Aug 2010
Updated 03 Aug 2010
Type Conference
Year 1999
Where AC