Sciweavers

11 search results - page 2 / 3
» A high-performance, low-power linear algebra core
EUROPAR
2009
Springer
13 years 10 months ago
Using Hybrid CPU-GPU Platforms to Accelerate the Computation of the Matrix Sign Function
Abstract. We investigate the performance of two approaches for matrix inversion based on Gaussian (LU factorization) and Gauss-Jordan eliminations. The target architecture is a cur...
Peter Benner, Pablo Ezzatti, Enrique S. Quintana-O...
ARCS
2008
Springer
13 years 7 months ago
An Optimized ZGEMM Implementation for the Cell BE
The architecture of the IBM Cell BE processor represents a new approach for designing CPUs. The fast execution of legacy software has to stand back in order to achieve very high ...
Timo Schneider, Torsten Hoefler, Simon Wunderlich,...
PVM
2010
Springer
13 years 3 months ago
Massively Parallel Finite Element Programming
Abstract. Today’s large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data struc...
Timo Heister, Martin Kronbichler, Wolfgang Bangert...
IPPS
2009
IEEE
13 years 12 months ago
Singular value decomposition on GPU using CUDA
Linear algebra algorithms are fundamental to many computing applications. Modern GPUs are suited for many general purpose processing tasks and have emerged as inexpensive high per...
Sheetal Lahabar, P. J. Narayanan
TPDS
2010
13 years 3 months ago
Parallel Two-Sided Matrix Reduction to Band Bidiagonal Form on Multicore Architectures
The objective of this paper is to extend, in the context of multicore architectures, the concepts of tile algorithms [Buttari et al., 2007] for Cholesky, LU, QR factorizations to t...
Hatem Ltaief, Jakub Kurzak, Jack Dongarra