Commodity computer clusters are often composed of hundreds of computing nodes. These generally off-the-shelf systems are not designed for high reliability. Node failures therefore...
This paper describes a novel approach to fault tolerance in distributed object-based systems. It uses the fragmented-object model to integrate replication mechanisms into distribut...
This paper describes a highly available distributed video on demand (VoD) service which is inherently fault tolerant. The VoD service is provided by multiple servers that reside at...
Massively parallel computing systems are being built with thousands of nodes. Because of the high number of components, it is critical to keep these systems running even in the pre...
Designing cost-sensitive real-time control systems for safety-critical applications requires a careful analysis of the cost/coverage trade-offs of fault-tolerant solutions. This fu...
Claudio Pinello, Luca P. Carloni, Alberto L. Sangi...