Multicore processors have not only reintroduced Non-Uniform Memory Access (NUMA) architectures in today's parallel computers, but they are also responsible for non-uniform access ...
In this paper we discuss our initial experiences adapting OpenMP to serve as a programming model for high-performance embedded systems. A high-level programming model...
Barbara M. Chapman, Lei Huang, Eric Biscondi, Eric...
Recent developments in processing devices such as graphical processing units and multi-core systems offer opportunities to make use of parallel techniques at the chip level to obt...
Daniel P. Playne, Mitchell Johnson, Kenneth A. Haw...
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of search-based performance optimizatio...
Samuel Williams, Jonathan Carter, Leonid Oliker, J...
We investigate the performance of two approaches for matrix inversion based on Gaussian (LU factorization) and Gauss-Jordan eliminations. The target architecture is a cur...
Peter Benner, Pablo Ezzatti, Enrique S. Quintana-O...