PDP
2008
IEEE

Scheduling of QR Factorization Algorithms on SMP and Multi-Core Architectures

This paper examines the scalable parallel implementation of QR factorization of a general matrix, targeting SMP and multi-core architectures. Two implementations of algorithms-by-blocks are presented. Each implementation views a block of a matrix as the fundamental unit of data, and likewise, operations over these blocks as the primary unit of computation. The first is a conventional blocked algorithm similar to those included in libFLAME and LAPACK but expressed in a way that allows operations in the so-called critical path of execution to be computed as soon as their dependencies are satisfied. The second algorithm captures a higher degree of parallelism with an approach based on Givens rotations while preserving the performance benefits of algorithms based on blocked Householder transformations. We show that the implementation effort is greatly simplified by expressing the algorithms in code with the FLAME/FLASH API, which allows matrices stored by blocks to be viewed and mana...
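To make the idea of treating a block as the unit of data and computation concrete, the following is a minimal NumPy sketch of a tile-by-tile QR factorization. It is an illustration under stated assumptions, not the authors' FLAME/FLASH code: the function name `tiled_qr_r`, the helper `tile`, and the block size `b` are hypothetical, each block-level kernel is replaced by a plain call to `numpy.linalg.qr`, and only the triangular factor R is formed.

```python
import numpy as np


def tiled_qr_r(a, b):
    """Return the upper-triangular factor R of `a`, computed tile by tile.

    Hypothetical illustration: every operation reads and writes whole
    b-by-b tiles, so each one could be treated as a separate task.
    """
    r = np.array(a, dtype=float)          # work on a copy
    n_rows, n_cols = r.shape
    assert n_rows % b == 0 and n_cols % b == 0, "dimensions must be multiples of b"
    mt, nt = n_rows // b, n_cols // b     # tile counts per dimension

    def tile(i, j):
        # NumPy slicing returns a view, so writes go into `r`.
        return r[i * b:(i + 1) * b, j * b:(j + 1) * b]

    for k in range(min(mt, nt)):
        # Task 1: factor the diagonal tile.
        q, rkk = np.linalg.qr(tile(k, k), mode="complete")
        tile(k, k)[:] = rkk
        # Task 2: apply Q^T to the tiles to the right of the diagonal tile.
        for j in range(k + 1, nt):
            tile(k, j)[:] = q.T @ tile(k, j)
        for i in range(k + 1, mt):
            # Task 3: factor the diagonal tile stacked on a tile below it.
            q2, r2 = np.linalg.qr(np.vstack([tile(k, k), tile(i, k)]),
                                  mode="complete")
            tile(k, k)[:] = r2[:b]
            tile(i, k)[:] = 0.0
            # Task 4: update the matching pair of tiles in each later column.
            for j in range(k + 1, nt):
                upd = q2.T @ np.vstack([tile(k, j), tile(i, j)])
                tile(k, j)[:] = upd[:b]
                tile(i, j)[:] = upd[b:]
    return r


rng = np.random.default_rng(0)
a0 = rng.standard_normal((6, 6))
r_factor = tiled_qr_r(a0, 2)
```

In this sketch the updates in the inner loops over `j` touch disjoint tiles, so for a fixed `k` and `i` they are independent of one another. That independence is the kind of parallelism a runtime could exploit by starting each task as soon as the tiles it reads are ready.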
Added 01 Jun 2010
Updated 01 Jun 2010
Type Conference
Year 2008
Where PDP
Authors Gregorio Quintana-Ortí, Enrique S. Quintana-Ortí, Ernie Chan, Robert A. van de Geijn, Field G. Van Zee