This paper describes extensions to OpenMP that implement data placement features needed for NUMA architectures. OpenMP is a collection of compiler directives and library routines ...
John Bircsak, Peter Craig, RaeLyn Crowell, Zarka C...
In this paper, we analyze the efficiency of three local enumeration and three storage compression schemes for cyclic(m) data distributions in High Performance Fortran (HPF). We sh...
We developed an automated environment to measure the memory access behavior of applications on high performance clusters. Code optimization for processor caches is cruc...
"Designing and Building Parallel Programs is a book for students and professionals who need to know how to write parallel programs. It is neither a programming language manual...
Hierarchical algorithms such as multigrid applications form an important cornerstone for scientific computing. In this study, we take a first step toward evaluating parallel lan...
Bradford L. Chamberlain, Steven J. Deitz, Lawrence...