Future scalable, high-throughput, and high-performance applications are likely to execute on platforms constructed by clustering multiple autonomous distributed servers, with reso...
Abstract. As data volumes processed by large-scale distributed data-intensive applications grow at high speed, increasing I/O pressure is put on the underlying storage service, w...
We describe a prototypical storage service through which we are addressing some of the open storage issues in wide-area distributed high-performance computing. We discuss some of t...
Checkpointing is a widely used mechanism for supporting fault tolerance, but it is notorious for its high-cost disk access. The idea of memory-based checkpointing has been extensively stu...
Distributed applications or workflows need to access and use compute, storage, and network resources either simultaneously or in a chronologically coordinated manner. Examples are distri...