Intra-iteration data reuse occurs when multiple array references exhibit data reuse in a single loop iteration. An optimizing compiler can exploit this reuse by clustering (in the...
Traditional list schedulers order instructions based on an optimistic estimate of the load latency imposed by the hardware and therefore cannot respond to variations in memory lat...
Compiler transformations can significantly improve data locality for many scientific programs. In this paper, we show iterative solvers for partial differential equations (PDEs) i...
Several existing compiler transformations can help improve communication-computation overlap in MPI applications. However, traditional compilers treat calls to the MPI library as ...
Anthony Danalis, Lori L. Pollock, D. Martin Swany,...
Bank locality can be defined as localizing the number of load/store accesses to a small set of memory banks at a given time. An optimizing compiler can modify a given input code t...
Guilin Chen, Mahmut T. Kandemir, Hendra Saputra, M...