As new computer architectures are developed to exploit large-scale data-level parallelism, techniques are needed to retarget legacy sequential code to these platforms. Sequential ...
While graphics processing units (GPUs) provide low-cost and efficient platforms for accelerating high-performance computations, the tedious process of performance tuning required...
Mehrzad Samadi, Amir Hormati, Mojtaba Mehrara, Jan...
Parallel architectures are the way of the future, but are notoriously difficult to program. In addition to the low-level constructs they often present (e.g., locks, DMA, and non-...
The recent parallel language standard for shared-memory multiprocessor (SMP) machines, OpenMP, promises a simple and portable interface for programmers who wish to exploi...
Seung-Jai Min, Seon Wook Kim, Michael Voss, Sang I...
OpenCL is a programming language standard that enables the programmer to express an application by structuring its computation as kernels. The OpenCL compiler is given the explic...
Pekka O. Jaskelainen, Carlos S. de La Lama, Pablo ...