Sciweavers

IPPS
2010
IEEE

An auto-tuning framework for parallel multicore stencil computations

13 years 2 months ago
An auto-tuning framework for parallel multicore stencil computations
Although stencil auto-tuning has shown tremendous potential in effectively utilizing architectural resources, it has hitherto been limited to single kernel instantiations; in addition, the large variety of stencil kernels used in practice makes this computation pattern difficult to assemble into a library. This work presents a stencil auto-tuning framework that significantly advances programmer productivity by automatically converting a straightforward sequential Fortran 95 stencil expression into tuned parallel implementations in Fortran, C, or CUDA, thus allowing performance portability across diverse computer architectures, including the AMD Barcelona, Intel Nehalem, Sun Victoria Falls, and the latest NVIDIA GPUs. Results show that our generalized methodology delivers significant performance gains of up to 22
Shoaib Kamil, Cy Chan, Leonid Oliker, John Shalf,
Added 13 Feb 2011
Updated 13 Feb 2011
Type Journal
Year 2010
Where IPPS
Authors Shoaib Kamil, Cy Chan, Leonid Oliker, John Shalf, Samuel Williams
Comments (0)