In contrast to the common belief that OpenMP requires data-parallel extensions to scale well on architectures with non-uniform memory access latency, recent work has shown that it ...
When vectorizing for SIMD architectures that are commonly employed by today’s multimedia extensions, one of the new challenges that arise is the handling of memory alignment. Pr...
: Data distribution is one of the key aspects that a parallelizing compiler for a distributed memory architecture should consider, in order to get efficiency from the system. The ...
This paper presents the design and the implementation of a compiler and runtime infrastructure for automatic program distribution. We are building a research infrastructure that e...
Roxana Diaconescu, Lei Wang, Zachary Mouri, Matt C...
—Analytical models have been used to estimate optimal values for parameters such as tile sizes in the context of loop nests. However, important algorithms such as fast Fourier tr...
Basilio B. Fraguela, Yevgen Voronenko, Markus P&uu...