Execution of MPI applications on Clusters and Grid deployments suffers from node and network failure that motivates the use of fault tolerant MPI implementations. Two category tec...
We have developed new methods for log-based recovery for middleware servers which involve thread pooling, private inmemory states for clients, shared in-memory state and message i...
ct Fault Tolerant MPI (FT-MPI)[6] was designed as a solution to allow applications different methods to handle process failures beyond simple check-point restart schemes. The init...
Graham E. Fagg, Thara Angskun, George Bosilca, Jel...
This paper presents a high-availability system architecture called INDRA — an INtegrated framework for Dependable and Revivable Architecture that enhances a multicore processor ...
Weidong Shi, Hsien-Hsin S. Lee, Laura Falk, Mrinmo...
Mobile ad hoc networks can be leveraged to provide ubiquitous services capable of acquiring, processing, and sharing real-time information from the physical world. Unlike Internet...