We present a unified approach to locality optimization that employs both data and control transformations. Data transformations include changing the array layout in memory. Contr...
Over the last decade, Message Passing Interface (MPI) has become a very successful parallel programming environment for distributed memory architectures such as clusters. However, ...
In this paper we consider several hardware implementations of the general-purpose atomic primitives fetch and Φ, compare and swap, load linked, and store conditionalon large-scal...
Abstract. Partitioned Global Address Space (PGAS) languages offer an attractive, high-productivity programming model for programming large-scale parallel machines. PGAS languages, ...
Christopher Barton, Calin Cascaval, George Alm&aac...
On a distributed memory machine, hand-coded message passing leads to the most efficient execution, but it is difficult to use. Parallelizing compilers can approach the performance...