This paper describes a tiling technique that can be used by application programmers and optimizing compilers to obtain I/O-efficient versions of regular scientific loop nests. Du...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
—The problem of automatically generating hardware modules from a high level representation of an application has been at the research forefront in the last few years. In this pap...
Tiling, a key transformation for optimizing programs, has been widely studied in literature. Parameterized tiled code is important for auto-tuning systems since they often execute...
Muthu Manikandan Baskaran, Albert Hartono, Sanket ...
Processing and analyzing large volumes of data plays an increasingly important role in many domains of scienti c research. High-level language and compiler support for developing ...
We present a novel approach to ray tracing execution on commodity graphics hardware using CUDA. We decompose
a standard ray tracing algorithm into several data-parallel stages tha...