Efficient performance tuning of parallel programs is often hard. In this paper we describe an approach that uses a uni-processor execution of a multithreaded program as reference ...
Architectures with software-writable parameters, or configurable architectures, enable runtime reconfiguration of computing platforms to the applications they execute. Such dynami...
This paper describes performance tuning experiences with a three-dimensional unstructured grid Euler flow code from NASA, which we have reimplemented in the PETSc framework and p...
William Gropp, Dinesh K. Kaushik, David E. Keyes, ...
Automatic performance tuning (auto-tuning) has been used in parallel numerical applications for adapting performance-relevant parameters. We extend auto-tuning to general-purpose ...
Christoph A. Schaefer, Victor Pankratius, Walter F...