Sciweavers

4889 search results - page 334 / 978
» A Refactoring Approach to Parallelism
PVM
1998
Springer
15 years 8 months ago
SKaMPI: A Detailed, Accurate MPI Benchmark
Abstract. SKaMPI is a benchmark for MPI implementations. Its purpose is the detailed analysis of the runtime of individual MPI operations and comparison of these for different imple...
Ralf Reussner, Peter Sanders, Lutz Prechelt, Matth...
PPOPP
2010
ACM
16 years 1 month ago
Fast tridiagonal solvers on the GPU
We study the performance of three parallel algorithms and their hybrid variants for solving tridiagonal linear systems on a GPU: cyclic reduction (CR), parallel cyclic reduction (...
Yao Zhang, Jonathan Cohen, John D. Owens
PPOPP
2010
ACM
16 years 1 month ago
Structure-driven optimizations for amorphous data-parallel programs
Irregular algorithms are organized around pointer-based data structures such as graphs and trees, and they are ubiquitous in applications. Recent work by the Galois project has pr...
Mario Méndez-Lojo, Donald Nguyen, Dimitrios...
ICPP
2008
IEEE
15 years 11 months ago
XMT-GPU: A PRAM Architecture for Graphics Computation
The shading processors in graphics hardware are becoming increasingly general-purpose. We test, through simulation and benchmarking, the potential performance impact of replacing ...
Thomas M. DuBois, Bryant Lee, Yi Wang, Marc Olano,...
PDP
2008
IEEE
15 years 11 months ago
Scheduling of QR Factorization Algorithms on SMP and Multi-Core Architectures
This paper examines the scalable parallel implementation of QR factorization of a general matrix, targeting SMP and multi-core architectures. Two implementations of algorithms-by-...
Gregorio Quintana-Ortí, Enrique S. Quintana...