— Large Clusters, high availability clusters and Grid deployments often suffer from network, node or operating system faults and thus require the use of fault tolerant programmin...
This paper describes a research project about robot control across a computing Grid, first step toward a Grid solution for generic process control. A computational Grid can signi...
Abstract-- High performance computing platforms like Clusters, Grid and Desktop Grids are becoming larger and subject to more frequent failures. MPI is one of the most used message...
Grid applications need to be fault tolerant, malleable, and migratable. In previous work, we have presented orphan saving, an efficient mechanism addressing these issues for divide...
Transparent system support for software fault tolerance reduces performance in general and precludes application-specific optimizations in particular. In contrast, explicit support...