This paper describes a framework for achieving node-level fault tolerance (NLFT) in distributed realtime systems. The objective of NLFT is to mask errors at the node level in orde...
As device scales shrink, higher transistor counts are available while soft-errors, even in logic, become a major concern. A new class of architectures, such as Merrimac and the IB...
Mattan Erez, Nuwan Jayasena, Timothy J. Knight, Wi...
- Partitionability allows the creation of many physically independent subsystems, each of which retains an identical functionality as its parent network and has no communication in...
Data aggregation plays an important role in the design of scalable systems, allowing the determination of meaningful system-wide properties to direct the execution of distributed a...
The Reusable Software Fault Tolerance Testbed ReSoFT was developed to facilitate the development and evaluation of high-assurance systems that require tolerance of both hardware...
Kam S. Tso, Eltefaat Shokri, Roger J. Dziegiel Jr.