Abstract. SKaMPI is a benchmark for MPI implementations. Its purpose is the detailed analysis of the runtime of individual MPI operations and comparison of these for different imple...
Ralf Reussner, Peter Sanders, Lutz Prechelt, Matth...
We study the performance of three parallel algorithms and their hybrid variants for solving tridiagonal linear systems on a GPU: cyclic reduction (CR), parallel cyclic reduction (...
Irregular algorithms are organized around pointer-based data structures such as graphs and trees, and they are ubiquitous in applications. Recent work by the Galois project has pr...
The shading processors in graphics hardware are becoming increasingly general-purpose. We test, through simulation and benchmarking, the potential performance impact of replacing ...
Thomas M. DuBois, Bryant Lee, Yi Wang, Marc Olano,...
This paper examines the scalable parallel implementation of QR factorization of a general matrix, targeting SMP and multi-core architectures. Two implementations of algorithms-by-...