— This paper presents an adaptive causal model method (adaptive CMM) for fault diagnosis and recovery in complex multi-robot teams. We claim that a causal model approach is effec...
The DepAuDE architecture provides middleware to integrate fault tolerance support into distributed embedded automation applications. It allows error recovery to be expressed in te...
Geert Deconinck, Vincenzo De Florio, Ronnie Belman...
After a system crash, databases recover to the last committed transaction, but applications usually either crash or cannot continue. The Phoenix purpose is to enable application s...
The execution of job flow applications is a reality today in academic and industrial domains. In this paper, we propose an approach to adding self-healing behavior to the executio...
In this paper, we describe the design and implementation of two mechanisms for fault-tolerance and recovery for complex scientific workflows on computational grids. We present our ...