We describe the communication infrastructure (CI) for our fault-tolerant cluster middleware, which is optimized for two classes of communication: for the applications and for the ...
Ming Li, Wenchao Tao, Daniel Goldberg, Israel Hsu,...
Not considered in the analytical model of the plant, uncertainties always dramatically decrease the performance of the fault detection task in the practice. To cope better with thi...
Abbas Khosravi, Joaquim Armengol Llobet, Esteban R...
Abstract: This paper presents a notation for describing functional fault models, which may occur in memory devices. Using this notation, the space of all possible memory faults has...
Designing a distributed fault tolerance algorithm requires careful analysis of both fault models and diagnosis strategies. A system will fail if there are too many active faults, ...
This paper describes the use of fault tolerance in a multiagent system. Such an approach is based on the modeling of autonomous agents with planning capabilities. These capabiliti...