Sciweavers

PARA
2004
Springer

A Family of High-Performance Matrix Multiplication Algorithms

13 years 10 months ago
A Family of High-Performance Matrix Multiplication Algorithms
During the last half-decade, a number of research efforts have centered around developing software for generating automatically tuned matrix multiplication kernels. These include the PHiPAC project and the ATLAS project. The software endproducts of both projects employ brute force to search a parameter space for blockings that accommodate multiple levels of memory hierarchy. We take a different approach: using a simple model of hierarchical memories we employ mathematics to determine a locally-optimal strategy for blocking matrices. The theoretical results show that, depending on the shape of the matrices involved, different strategies are locally-optimal. Rather than determining a blocking strategy at library generation time, the theoretical results show that, ideally, one should pursue a heuristic that allows the blocking strategy to be determined dynamically at run-time as a function of the shapes of the operands. When the resulting family of algorithms is combined with a highly ...
John A. Gunnels, Fred G. Gustavson, Greg Henry, Ro
Added 02 Jul 2010
Updated 02 Jul 2010
Type Conference
Year 2004
Where PARA
Authors John A. Gunnels, Fred G. Gustavson, Greg Henry, Robert A. van de Geijn
Comments (0)