The Reusable Software Fault Tolerance Testbed ReSoFT was developed to facilitate the development and evaluation of high-assurance systems that require tolerance of both hardware...
Kam S. Tso, Eltefaat Shokri, Roger J. Dziegiel Jr.
Middleware implementation of various critical services required by large-scale and complex real-time applications on top of COTS operating system is currently an approach of growi...
Eltefaat Shokri, Patrick Crane, K. H. Kim, Chittur...
In this paper, a task parallel application is implemented with Ninf-G which is a GridRPC system, and experimented on, using the Grid testbed in Asia Pacific, for three months. The...
Record and Replay (RR) is a software based state replication solution designed to support recording and subsequent replay of the execution of unmodified applications running on mu...
Philippe Bergheaud, Dinesh Subhraveti, Marc Vertes
The advent of deep sub-micron technology has exacerbated reliability issues in on-chip interconnects. In particular, single event upsets, such as soft errors, and hard faults are ...
Dongkook Park, Chrysostomos Nicopoulos, Jongman Ki...