Sciweavers

SAIG
2000
Springer

Code Generators for Automatic Tuning of Numerical Kernels: Experiences with FFTW

Achieving peak performance in important numerical kernels such as dense matrix multiply or sparse matrix-vector multiplication usually requires extensive, machine-dependent tuning by hand. In response, a number of automatic tuning systems have been developed, which typically operate by (1) generating multiple implementations of a kernel and (2) empirically selecting an optimal implementation. One such system is FFTW (Fastest Fourier Transform in the West) for the discrete Fourier transform. In this paper, we review FFTW's inner workings with an emphasis on its code generator, and report on our empirical evaluation of the system on two different hardware and compiler platforms. We then describe a number of our own extensions to the FFTW code generator that compute efficient discrete cosine transforms and show promising speed-ups over a vendor-tuned library. We also comment on current opportunities to develop tuning systems in the spirit of FFTW for other widely used kernels.
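The two-step scheme described in the abstract (generate several implementations, then select one empirically) can be illustrated with a minimal Python sketch. This is not FFTW's code generator or its API: the two candidates below are written by hand rather than generated, and the names `dft_direct`, `fft_radix2`, `select_fastest`, and `CANDIDATES` are hypothetical, chosen only for illustration.

```python
import cmath
import timeit


def dft_direct(x):
    """O(n^2) DFT computed straight from the definition."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    half = n // 2
    out = [0j] * n
    for k in range(half):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def select_fastest(candidates, x, repeats=3):
    """Time every candidate on input x; return the fastest name and all timings."""
    timings = {
        name: min(timeit.repeat(lambda f=f: f(x), number=1, repeat=repeats))
        for name, f in candidates.items()
    }
    best = min(timings, key=timings.get)
    return best, timings


# Step (1): the pool of implementations (hand-written here, an assumption).
CANDIDATES = {"direct": dft_direct, "radix2": fft_radix2}

if __name__ == "__main__":
    signal = [complex(i % 7, 0) for i in range(256)]
    # Step (2): pick the implementation that runs fastest on this machine.
    best, timings = select_fastest(CANDIDATES, signal)
    print("selected:", best)
```

The selection is made by measurement on the target machine rather than by a performance model, which is the sense of "empirically" used above; a real tuning system would search a far larger space of generated variants.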
Rich Vuduc, James Demmel
Added 25 Aug 2010
Updated 25 Aug 2010
Type Conference
Year 2000
Where SAIG
Authors Rich Vuduc, James Demmel