A challenging issue in today's server systems is to transparently deal with failures and application-imposed requirements for continuous operation. In this paper we address t...
Commercial designs are integrating from 10 to 100 embedded functional and storage blocks in a single system on chip (SoC) currently, and the number is likely to increase significa...
This paper describes the OFTT (OLE Fault Tolerance Technology), a fault tolerance middleware toolkit running on the Microsoft Windows NT operating system that provides required fa...
The ability to guarantee that a system will continue to operate correctly under degraded conditions is key to the success of adopting multi-agent systems (MAS) as a paradigm for d...
With rapid increase of parallel computation systems in their sizes, it is inevitable to develop algorithms that are applicable even if there exist faulty elements in the systems. ...