To be able to fully exploit ever larger computing platforms, modern HPC applications and system software must be able to tolerate inevitable faults. Historically, MPI implementati...
Joshua Hursey, Jeffrey M. Squyres, Timothy Mattox,...
We present an efficient authenticated and fault-tolerant protocol (AFTD) for tree-based key agreement. Our approach is driven by the insight that when a Diffie-Hellman blinded key ...
A network G* is called a random-fault-tolerant (RFT) network for a network G if G* contains a fault-free isomorphic copy of G with high probability even if each processor fails indepe...
A case study of performance and dependability evaluation of fault-tolerant multiprocessors is presented. Two specific architectures are analyzed taking into account system functio...
In environments like the Internet, faults follow unusual patterns, dictated by the combination of malicious attacks with accidental faults such as long communication delays caused...
Giuliana Santos Veronese, Miguel Correia, Lau Cheu...