We consider cluster systems with multiple nodes where each server is prone to run tasks at a degraded level of service due to some software or hardware fault. The cluster serves t...
The advent of networking technologies and high performance transport protocols facilitates the service of storage over networks. However, they pose challenges in integration and i...
Jiesheng Wu, Pete Wyckoff, Dhabaleswar K. Panda, R...
A Federated Array of Bricks is a scalable distributed storage system composed from inexpensive storage bricks. It achieves high reliability with low cost by using erasure coding a...
: A general purpose group communication protocol suite called Newtop is described. It is assumed that processes can simultaneously belong to many groups, group size could be large,...
We present a new approach to managing failures and evolution in large, complex distributed systems using runtime paths. We use the paths that requests follow as e through the syst...
Mike Y. Chen, Anthony Accardi, Emre Kiciman, David...