Two schemes proposed to cope with unrecoverable or latent media errors and enhance the reliability of RAID systems are examined. The first scheme is the established, widely used d...
Ilias Iliadis, Robert Haas, Xiao-Yu Hu, Evangelos ...
In this work we propose an online reliability tracking framework that utilizes a hybrid network of on-chip temperature and delay sensors together with a circuit reliability macrom...
This workshop provides a forum for an overview, project presentations, and discussion of the research fostered and funded initially by the NSF Next Generation Software (NGS) Progr...
Hybrid discrete-continuous models, such as Jump Markov Linear Systems, are convenient tools for representing many real-world systems; in the case of fault detection, discrete jumps...
Lars Blackmore, Askar Bektassov, Masahiro Ono, Bri...
A noise maker is a tool that seeds a concurrent program with conditional synchronization primitives (such as yield()) for the purpose of increasing the likelihood that a bug manif...