Large-scale systems like BlueGene/L are susceptible to a number of software and hardware failures that can affect system performance. Periodic application checkpointing is a commo...
The DAG-based task graph model has been found effective in scheduling for performance prediction and optimization of parallel applications. However the scheduling complexity and s...
Abstract—Parallel file systems are designed to mask the everincreasing gap between CPU and disk speeds via parallel I/O processing. While they have become an indispensable compo...
Grids are becoming a mission-critical component in research and industry. The services they provide are thus required to be highly available, contributing to the vision of the Gri...
Mark Silberstein, Gabriel Kliot, Artyom Sharov, As...
In the area of Grid computing, there is a growing need to process large amounts of data. To support this trend, we need to develop efficient parallel storage systems that can prov...