This paper is the first extensive performance study of a recently proposed parallel programming model called Concurrent Collections (CnC). In CnC, the programmer expresses her co...
This paper presents a comprehensive performance review of an MPI-based, high-order, three-dimensional spectral element method C++ toolbox. The focus is on the performance...
Christoph Bosshard, Roland Bouffanais, Christian C...
The shading processors in graphics hardware are becoming increasingly general-purpose. We test, through simulation and benchmarking, the potential performance impact of replacing ...
Thomas M. DuBois, Bryant Lee, Yi Wang, Marc Olano,...
We present a new cache-oblivious scheme for iterative stencil computations that performs beyond system bandwidth limitations as though gigabytes of data could reside in an enormou...
Robert Strzodka, Mohammed Shaheen, Dawid Pajak, Ha...
The ability to automatically parallelize standard programming languages results in program portability across a wide range of machine architectures. It is the goal of the Polaris ...
William Blume, Rudolf Eigenmann, Keith Faigin, Joh...