We demonstrate Spiral, a domain-specific library generation system. Spiral generates high performance source code for linear transforms (such as the discrete Fourier transform and ...
Achieving peak performance in important numerical kernels such as dense matrix multiply or sparse-matrix vector multiplication usually requires extensive, machine-dependent tuning ...
Short vector SIMD instructions on recent microprocessors, such as SSE on Pentium III and 4, speed up code but are a major challenge to software developers. We present a compiler t...
We present a domain-specific approach to generate highperformance hardware-software partitioned implementations of the discrete Fourier transform (DFT) in fixed point precision....
Paolo D'Alberto, Peter A. Milder, Aliaksei Sandryh...