Sciweavers

CGO
2010
IEEE

Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs

In this paper we describe techniques for compiling fine-grained SPMD-threaded programs, expressed in programming models such as OpenCL or CUDA, to multicore execution platforms. Programs developed for manycore processors typically express finer thread-level parallelism than is appropriate for multicore platforms. We describe options for implementing fine-grained threading in software, and find that reasonable restrictions on the synchronization model enable significant optimizations and performance improvements over a baseline approach. We evaluate these techniques in a production-level compiler and runtime for the CUDA programming model targeting modern CPUs. Applications tested with our tool often showed performance parity with the compiled C version of the application for single-thread performance. With modest coarse-grained multithreading typical of today's CPU architectures, an average of 3.4
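As a rough illustration of what implementing fine-grained threading in software can look like, the sketch below serializes the logical threads of one block into ordinary loops on a single CPU thread and splits the loop wherever the kernel has a barrier. It is not taken from the paper: the kernel, the function name `run_block_serialized`, and the bound `MAX_BLOCK` are hypothetical, and the sketch assumes a restricted synchronization model in which every thread of the block reaches the same barrier at the same program point.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCK 256 /* hypothetical upper bound on logical threads per block */

/*
 * Logical SPMD kernel, written once per fine-grained thread `tid`:
 *
 *     tmp[tid] = 2 * in[tid];
 *     barrier();
 *     out[tid] = tmp[tid] + tmp[(tid + 1) % n];
 *
 * Assumption (not from the source): all n logical threads reach the
 * barrier at the same program point, so the barrier can be replaced
 * by the boundary between two loops over `tid`.
 */
void run_block_serialized(const int *in, int *out, size_t n)
{
    int tmp[MAX_BLOCK]; /* stands in for per-block shared storage */
    assert(n > 0 && n <= MAX_BLOCK);

    /* Region before the barrier, executed for every logical thread. */
    for (size_t tid = 0; tid < n; ++tid)
        tmp[tid] = 2 * in[tid];

    /* The barrier becomes this loop boundary: every write to tmp has
     * completed before any logical thread reads a neighbour's value. */

    /* Region after the barrier, executed for every logical thread. */
    for (size_t tid = 0; tid < n; ++tid)
        out[tid] = tmp[tid] + tmp[(tid + 1) % n];
}
```

Under these assumptions, one CPU thread executes a whole block of logical threads with no context switches, and coarse-grained parallelism could then come from assigning different blocks to different cores.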
Added 02 Aug 2010
Updated 02 Aug 2010
Type Conference
Year 2010
Where CGO
Authors John A. Stratton, Vinod Grover, Jaydeep Marathe, Bastiaan Aarts, Mike Murphy, Ziang Hu, Wen-mei W. Hwu