Computational grids are large-scale computing systems composed of geographically distributed resources (computers, storage, etc.) owned by self-interested agents or organizations. T...
Large-scale parallel computing increasingly relies on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming commonplace. Current t...
Arun Babu Nagarajan, Frank Mueller, Christian Enge...
Recent improvements in workstation and interconnection network performance have popularized clusters of off-the-shelf workstations. However, the usefulness of these cluste...
Clusters and applications continue to grow in size while their mean time between failures (MTBF) is getting smaller. Checkpoint/Restart is becoming increasingly important for lar...
Memory-intensive applications often suffer from the poor performance of disk swapping when memory is inadequate. Remote memory sharing schemes, which provide a remote memory that ...