Algorithms are developed for calculating dealiased linear convolution sums without the expense of conventional zero-padding or phase-shift techniques. For one-dimensional in-place ...
Abstract. We present the Operator Language (OL), a framework to automatically generate fast numerical kernels. OL provides the structure to extend the program generation system Spi...
Short vector SIMD instructions on recent microprocessors, such as SSE on Pentium III and 4, speed up code but are a major challenge to software developers. We present a compiler t...
In this paper we use a mathematical approach to automatically generate high performance short vector code for the discrete Fourier transform (DFT). We represent the well-known Coo...
our results using the Fast Fourier Transformation, the N-body attraction problem, and the cubic splines interpolation as examples.We investigate the application of partial evaluati...