The performance skeleton of an application is a short running program whose execution time in any scenario reflects the estimated execution time of the application it represents....
This paper describes extensions to OpenMP that implement data placement features needed for NUMA architectures. OpenMP is a collection of compiler directives and library routines ...
John Bircsak, Peter Craig, RaeLyn Crowell, Zarka C...
A number of deterministic parallel programming models with strong safety guarantees are emerging, but similar support for nondeterministic algorithms, such as branch and bound sea...
Robert L. Bocchino Jr., Stephen Heumann, Nima Hona...
This paper discusses a program synthesis system to facilitate the generation of high-performance parallel programs for a class of computations encountered in quantum chemistry and...
Gerald Baumgartner, David E. Bernholdt, Daniel Coc...
We show in this paper how to evaluate the performance of skeleton-based high level parallel programs. Since many applications follow some commonly used algorithmic skeletons, we id...
Anne Benoit, Murray Cole, Stephen Gilmore, Jane Hi...