This paper presents the design and preliminary implementation of MpPVM, a software system that supports process migration for PVM application programs in a non-dedicated heterogen...
As the scale of high performance computing (HPC) continues to grow, application fault resilience becomes crucial. To address this problem, we are working on the design of an adapt...
In grid computing systems, providing fault-tolerance is required for both scientific computation and file-sharing to increase their reliability. In previous works, several mechani...
Sangho Yi, Derrick Kondo, Bongjae Kim, Geunyoung P...
As the scale of cluster computing grows, it is becoming hard for long-running applications to complete without facing failures on large-scale clusters. To address this issue, chec...
Malleability enables a parallel application’s execution system to split or merge processes modifying granularity. While process migration is widely used to adapt applications to...
Kaoutar El Maghraoui, Travis J. Desell, Boleslaw K...