Mobile computing allows ubiquitous and continuous access to computing resources while users travel or work at a client's site. The flexibility introduced by mobile computi...
Abstract. With the number of computing elements spiraling to hundreds of thousands in modern HPC systems, failures are common events. Few applications are nevertheless fault toleran...
George Bosilca, Aurelien Bouteiller, Thomas H&eacu...
Fault-tolerant protocols, asynchronous and synchronous alike, make stationary fault assumptions: only a fraction f of the total n nodes may fail. Whilst a synchronous protocol is ...
Paulo Sousa, Nuno Ferreira Neves, Paulo Verí...
This paper describes the support provided for mobility and fault tolerance in Mykil, which is a key distribution protocol for large, secure group multicast. Mykil is based on a com...
Fault tolerance in MPI has become a major issue in the HPC community. Several approaches are envisioned, from user- or programmer-controlled fault tolerance to fully automatic fault ...
Aurelien Bouteiller, Boris Collin, Thomas Hé...