Abstract. As multi-core processors proliferate, it has become more important than ever to ensure efficient execution of parallel jobs on multiprocessor systems. In this paper, we s...
To be competitive, enterprises are collecting and analyzing increasingly large amounts of data in order to derive business insights. However, there are at least two challenges to m...
Fast hardware turnover in supercomputing centers, stimulated by rapid technological progress, results in high heterogeneity among HPC platforms, and necessitates that applications...
Cooperative checkpointing uses global knowledge of the state and health of the machine to improve performance and reliability by dynamically deciding when to skip checkpoint reque...
A novel solution to the reliable multicast problem is the “digital fountain” approach, in which data is encoded with an erasure protection code before transmission, and receiv...