Recent developments in the field of object-based fault tolerance and the advent of the first OMG FTCORBA compliant middleware raise new requirements for the design process of dist...
In this paper, we analyze the fault tolerance of several bounded-degree networks that are commonly used for parallel computation. Among other things, we show that an N-node butterf...
Frank Thomson Leighton, Bruce M. Maggs, Ramesh K. ...
Massively parallel computing systems are being built with thousands of nodes. Because of the high number of components, it is critical to keep these systems running even in the pre...
Failure Mode and Effect Analysis (FMEA) is a method for assessing cause-consequence relations between component faults and hazards that may occur during the lifetime of a system. ...
Currently, some coarse measures like global network latency are used to compare routing protocols. These measures do not provide enough insight of traffic distribution among networ...