Sciweavers

ARCS
2008
Springer

An Optimized ZGEMM Implementation for the Cell BE

The architecture of the IBM Cell BE processor represents a new approach to CPU design. Fast execution of legacy software takes a back seat to very high performance for new scientific software. The Cell BE consists of nine independent cores and is a promising new architecture for HPC systems. The programmer has to write parallel software that is distributed across the cores, which execute subtasks of the program in parallel. The simplified vector-CPU design achieves higher clock rates and better power efficiency and exhibits predictable behavior. However, exploiting the capabilities of this emerging CPU architecture requires optimized libraries for frequently used algorithms. The Basic Linear Algebra Subprograms (BLAS) provide functions that are crucial for many scientific applications. The routine ZGEMM, which computes a complex matrix...
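For orientation, the standard BLAS ZGEMM operation updates a double-precision complex matrix as C ← αAB + βC (optionally with transposed or conjugate-transposed operands). The following is a minimal, unoptimized reference sketch of that operation in C. It is not the Cell BE implementation described in the paper: the function name `zgemm_naive`, the row-major layout, and the omission of the transpose options and leading-dimension arguments are all assumptions made for illustration.

```c
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper, not part of any BLAS library: naive reference for
 * C <- alpha*A*B + beta*C with non-transposed, row-major operands.
 * A is m x k, B is k x n, C is m x n. */
void zgemm_naive(size_t m, size_t n, size_t k,
                 double complex alpha,
                 const double complex *A,
                 const double complex *B,
                 double complex beta,
                 double complex *C)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            /* Dot product of row i of A with column j of B. */
            double complex acc = 0.0;
            for (size_t p = 0; p < k; ++p)
                acc += A[i * k + p] * B[p * n + j];
            /* Scale the product and blend in the old value of C. */
            C[i * n + j] = alpha * acc + beta * C[i * n + j];
        }
    }
}
```

The real BLAS interface differs from this sketch: it uses column-major storage and takes transpose flags and leading dimensions. An optimized version would also block the loops and vectorize the inner product, which this sketch does not attempt.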
Added 12 Oct 2010
Updated 12 Oct 2010
Type Conference
Year 2008
Where ARCS
Authors Timo Schneider, Torsten Hoefler, Simon Wunderlich, Torsten Mehlan, Wolfgang Rehm