Distributed fault management and event notification are essential in Inter-AS Traffic Engineering (TE). In this paper we design and implement distributed fault management for WBEM...
ACE analysis is a technique to provide an early reliability estimate for microprocessors. ACE analysis couples data from performance models with low level design details to identi...
As technology scales ever further, device unreliability is creating excessive complexity for hardware to maintain the illusion of perfect operation. In this paper, we consider whe...
Marc de Kruijf, Shuou Nomura, Karthikeyan Sankaral...
As feature sizes shrink, transient failures of on-chip network links become a critical problem. At the same time, many applications require guarantees on both message arrival prob...
In this paper we present an approach to the scheduling and voltage scaling of low-power fault-tolerant hard real-time applications mapped on distributed heterogeneous embedded sys...