This paper presents a new approach for the execution of coarse-grain (tiled) parallel SPMD code for applications derived from the explicit discretization of 2-dimensional PDE prob...
Georgios I. Goumas, Nikolaos Drosinos, Vasileios K...
Irregular applications, which rely on pointer-based data structures, are often difficult to parallelize. The input-dependent nature of their execution means that traditional paral...
Targeted optimization of program segments can provide an additional program speedup over the highest default optimization level, such as -O3 in GCC. The key challenge is how to au...
Haiping Wu, Eunjung Park, Mihailo Kaplarevic, Ying...
The excessive complexity of both machine architectures and applications has made it difficult for compilers to statically model and predict application behavior. This observatio...
Qing Yi, Keith Seymour, Haihang You, Richard W. Vu...
With the increasing gap between processor speed and memory latency, the performance of data-dominated programs is becoming more reliant on fast data access, which can be improved...