Sciweavers

OSDI
2004
ACM

FUSE: Lightweight Guaranteed Distributed Failure Notification

FUSE is a lightweight failure notification service for building distributed systems. Distributed systems built with FUSE are guaranteed that failure notifications never fail. Whenever a failure notification is triggered, all live members of the FUSE group will hear a notification within a bounded period of time, irrespective of node or communication failures. In contrast to previous work on failure detection, the responsibility for deciding that a failure has occurred is shared between the FUSE service and the distributed application. This allows applications to implement their own definitions of failure. Our experience building a scalable distributed event delivery system on an overlay network has convinced us of the usefulness of this service. Our results demonstrate that the network costs of each FUSE group can be small; in particular, our overlay network implementation requires no additional liveness-verifying ping traffic beyond that already needed to maintain the overlay, making...
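The group semantics described in the abstract can be illustrated with a small in-memory sketch. This is a toy model under stated assumptions, not the authors' implementation: the class `FuseGroup` and the methods `register_handler`, `signal_failure`, and `node_crashed` are hypothetical names chosen for illustration, and the sketch ignores networking and the bounded-time delivery guarantee. It shows only two ideas from the text: either the application or the service may declare a failure, and every live member then hears exactly one notification.

```python
from typing import Callable, Dict, List, Set, Tuple


class FuseGroup:
    """Toy model of one failure-notification group (hypothetical API)."""

    def __init__(self, members: Set[str]) -> None:
        self.members: Set[str] = set(members)
        self.live: Set[str] = set(members)
        self.handlers: Dict[str, Callable[[str], None]] = {}
        self.failed: bool = False

    def register_handler(self, node: str, handler: Callable[[str], None]) -> None:
        """Register the callback a member runs when the group fails."""
        if node not in self.members:
            raise ValueError(f"{node} is not a member of this group")
        self.handlers[node] = handler

    def signal_failure(self, reason: str) -> None:
        """Application-side trigger: the application defines what failure means."""
        self._notify(reason)

    def node_crashed(self, node: str) -> None:
        """Service-side trigger: the service has decided a member is unreachable."""
        self.live.discard(node)
        self._notify(f"node {node} unreachable")

    def _notify(self, reason: str) -> None:
        # A group fails at most once; later triggers are ignored.
        if self.failed:
            return
        self.failed = True
        # Every member that is still live hears the notification.
        for node in sorted(self.live):
            handler = self.handlers.get(node)
            if handler is not None:
                handler(reason)


# Usage: three members, one crashes, the survivors are each notified once.
log: List[Tuple[str, str]] = []
group = FuseGroup({"a", "b", "c"})
for name in ("a", "b", "c"):
    group.register_handler(name, lambda reason, name=name: log.append((name, reason)))

group.node_crashed("c")                    # service-detected failure
group.signal_failure("app-level timeout")  # ignored: the group has already failed
```

Treating the group as a one-shot object keeps the model simple: once any trigger fires, the group is finished, and an application that wants to continue would create a new group.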
Added 03 Dec 2009
Updated 03 Dec 2009
Type Conference
Year 2004
Where OSDI
Authors John Dunagan, Nicholas J. A. Harvey, Michael B. Jones, Dejan Kostic, Marvin Theimer, Alec Wolman