This paper presents an approach for the reliability-aware design optimization of real-time systems on multi-processor platforms. The optimization is based on an extension of wel...
Jia Huang, Jan Olaf Blech, Andreas Raabe, Christia...
Reliability is a major requirement for most safety-related systems. To meet this requirement, fault-tolerant techniques such as hardware replication and software re-execution are ...
Jia Huang, Jan Olaf Blech, Andreas Raabe, Christia...
This paper describes a new method for providing transparent fault tolerance for parallel applications on a network of workstations. We have designed our method in the context of sh...
Fine-Grained Cycle Sharing (FGCS) systems aim at utilizing the large amount of idle computational resources available on the Internet. Such systems allow guest jobs to run on a ho...
An important problem in the field of distributed systems is that of detecting the termination of a distributed computation. Distributed termination detection (DTD) is a difficult p...