Increasingly scientific discoveries are driven by analyses of massively distributed bulk data. This has led to the proliferation of high-end mass storage systems, storage area clu...
— Checkpointing is an indispensable technique to provide fault tolerance for long-running high-throughput applications like those running on desktop grids. This paper argues that...
Samer Al-Kiswany, Matei Ripeanu, Sudharshan S. Vaz...
This paper presents new mechanisms for dynamic resource management in a cluster manager called Clusteron-Demand (COD). COD allocates servers from a common pool to multiple virtual...
Jeffrey S. Chase, David E. Irwin, Laura E. Grit, J...