Abstract. Grid reliability remains an order of magnitude below that of clusters on production infrastructures. This work is aimed at improving grid application performance by improving ...
Diane Lingrand, Johan Montagnat, Janusz Martyniak,...
Resource reservations in advance are a mature concept for the allocation of various resources, particularly in grid environments. Common grid toolkits such as Globus support advanc...
Large-scale systems like BlueGene/L are susceptible to a number of software and hardware failures that can affect system performance. Periodic application checkpointing is a commo...
In systems consisting of multiple clusters of processors which employ space sharing for scheduling jobs, such as our Distributed ASCI Supercomputer (DAS), coallocation, i.e., the...
It is often difficult to efficiently perform a collection of jobs with complex job dependencies due to the temporal unpredictability of the grid. One way to mitigate the unpredictabili...
Grzegorz Malewicz, Ian T. Foster, Arnold L. Rosenb...