We present a new cache oblivious scheme for iterative stencil computations that performs beyond system bandwidth limitations as though gigabytes of data could reside in an enormou...
Robert Strzodka, Mohammed Shaheen, Dawid Pajak, Ha...
Abstract. Current hardware trends place increasing pressure on programmers and tools to optimize scientific code. Numerous tools and techniques exist, but no single tool is a pana...
Abstract. The application field of static analysis techniques for objectoriented programming is getting broader, ranging from compiler optimizations to security issues. This leads...
Isabelle Pollet, Baudouin Le Charlier, Agostino Co...
For most parallel and high performance systems, tuning guides provide the users with advices to optimize the execution time of their programs. Execution time may be very sensitive...
This paper describes extensions to OpenMP that implement data placement features needed for NUMA architectures. OpenMP is a collection of compiler directives and library routines ...
John Bircsak, Peter Craig, RaeLyn Crowell, Zarka C...