Loop distribution is an integral part of transforming a sequential program into a parallel one. It is used extensively in parallelization,vectorization, and memory management. For...
Consider a parallel program with n processes and a synchronization granularity z. Consider also two multiprocessors: a multiprocessor with q processors and run-time reallocation o...
Lars Lundberg, Kamilla Klonowska, Magnus Broberg, ...
Many parallel applications from scientific computing use MPI collective communication operations to collect or distribute data. Since the execution times of these communication op...
We present design and implementation details as well as performance results for two new parallel checkpointing libraries developed by us for parallel MPI applications. The first o...
Clustered microarchitectures are an attractive alternative to large monolithic superscalar designs due to their potential for higher clock rates in the face of increasingly wire-d...
Rajeev Balasubramonian, Sandhya Dwarkadas, David H...