We present S, the first system to provide transparent, lowoverhead application record-replay and the ability to go live from replayed execution. S i...
Despite a large research effort, software distributed shared memory systems have not been widely used to run parallel applications across clusters of computers. The higher perform...
Abstract—Execution of applications on upcoming highperformance computing (HPC) systems introduces a variety of new challenges and amplifies many existing ones. These systems will...
Avneesh Pant, Hassan Jafri, Volodymyr V. Kindraten...
Efficient performance tuning of parallel programs is often hard. In this paper we describe an approach that uses a uni-processor execution of a multithreaded program as reference ...
Task graph scheduling has been found effective in performance prediction and optimization of parallel applications. A number of static scheduling algorithms have been proposed for...