Despite extensive testing in the development phase, residual defects can be a great threat to dependability in the operational phase. This paper studies the utility of low-cost, ge...
Fault tolerance in MPI has become a major issue in the HPC community. Several approaches are envisioned, from user- or programmer-controlled fault tolerance to fully automatic fault ...
Aurelien Bouteiller, Boris Collin, Thomas Hé...
Large-scale parallel computing increasingly relies on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming commonplace. Current t...
Arun Babu Nagarajan, Frank Mueller, Christian Enge...
Today’s parallel computers with SMP nodes provide both multithreading and message passing as their modes of parallel execution. As a consequence, performance analysis a...
This paper presents a framework that explicitly detects events in broadcast baseball videos and facilitates the development of many practical applications. Three phases of contr...