Sciweavers

ICS
2010
Tsinghua U.

Speeding up Nek5000 with autotuning and specialization

13 years 6 months ago
Speeding up Nek5000 with autotuning and specialization
Autotuning technology has emerged recently as a systematic process for evaluating alternative implementations of a computation, in order to select the best-performing solution for a particular architecture. Specialization optimizes code customized to a particular class of input data set. In this paper, we demonstrate how compilerbased autotuning that incorporates specialization for expected data set sizes of key computations can be used to speed up Nek5000, a spectral-element code. Nek5000 makes heavy use of what are effectively Basic Linear Algebra Subroutine (BLAS) calls, but for very small matrices. Through autotuning and specialization, we can achieve significant performance gains over hand-tuned libraries (e.g., Goto, ATLAS, and ACML BLAS). Additional performance gains are obtained from using higher-level compiler optimizations that aggregate multiple BLAS calls. We demonstrate more than 2.2X performance gains on an Opteron over the original manually
Jaewook Shin, Mary W. Hall, Jacqueline Chame, Chun
Added 29 Sep 2010
Updated 29 Sep 2010
Type Conference
Year 2010
Where ICS
Authors Jaewook Shin, Mary W. Hall, Jacqueline Chame, Chun Chen, Paul F. Fischer, Paul D. Hovland
Comments (0)