Distributed information systems are critical to the functioning of many businesses; designing them to be dependable is a challenging but important task. We report our experience i...
Jeremy Bryans, John S. Fitzgerald, Alexander Roman...
Abstract. In recent years, there has been a surge of interest in Javabased volunteer computing systems, which aim to make it possible to build very large parallel computing network...
Record and Replay (RR) is a software based state replication solution designed to support recording and subsequent replay of the execution of unmodified applications running on mu...
Philippe Bergheaud, Dinesh Subhraveti, Marc Vertes
TPT-RAID is a multi-box RAID wherein each ECC group comprises at most one block Jrom any given storage box, and can thus tolerate a boxJailure. It extends the idea ojan out-oj-ban...
In a recent paper [2] we have proposed FT-TCP: an architecture that allows a replicated service to survive crashes without breaking its TCP connections. FT-TCP is attractive in pr...
Dmitrii Zagorodnov, Keith Marzullo, Lorenzo Alvisi...