Input space splitting for OpenCL

The performance of OpenCL programs suffers from memory and control flow divergence. Therefore, OpenCL compilers employ static analyses to identify non-divergent control flow and memory accesses in order to produce faster code. However, divergence is often input-dependent and hence can be observed for some, but not all, inputs. In these cases, vectorizing compilers have to generate slow code because divergence can occur at run time. In this paper, we use a polyhedral abstraction to partition the input space of an OpenCL kernel. For each partition, divergence analysis produces more precise results, i.e., it can classify more code parts as non-divergent. Consequently, specializing the kernel for the input space partitions allows for generating better SIMD code because of less divergence. We implemented our technique in an OpenCL driver for the AVX instruction set and evaluate it on a range of OpenCL benchmarks. We observe speedups of up to 9× for irregular kernels over a state-of-the-art vectoriz...
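To illustrate the idea of specializing code for a part of the input space, here is a minimal sketch in plain C. It is not the polyhedral analysis or the driver described in the abstract. It emulates one SIMD group of work-items with a loop, and every name in it (LANES, scale_generic, scale_uniform, scale_all) is hypothetical. The bounds guard gid < n is input-dependent: it can only diverge in a group that straddles n, so on the partition where the whole group lies below n the guard is uniformly true and can be dropped.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 8 /* hypothetical SIMD width, e.g. 8 floats per AVX register */

/* Generic version: the guard (gid < n) may differ between the lanes of one
   group, so each lane is checked individually (divergent control flow). */
static void scale_generic(float *out, const float *in, size_t n,
                          size_t group_begin) {
    for (size_t lane = 0; lane < LANES; ++lane) {
        size_t gid = group_begin + lane;
        if (gid < n)
            out[gid] = 2.0f * in[gid];
    }
}

/* Specialized version: only valid on the input partition where
   group_begin + LANES <= n. The guard is uniformly true there, so the
   loop body is branch-free and easy to vectorize. */
static void scale_uniform(float *out, const float *in, size_t n,
                          size_t group_begin) {
    assert(group_begin + LANES <= n); /* precondition of this partition */
    for (size_t lane = 0; lane < LANES; ++lane)
        out[group_begin + lane] = 2.0f * in[group_begin + lane];
}

/* Dispatcher: picks the variant per group and returns how many groups
   took the specialized, non-divergent path. */
size_t scale_all(float *out, const float *in, size_t n) {
    size_t fast_groups = 0;
    for (size_t begin = 0; begin < n; begin += LANES) {
        if (begin + LANES <= n) {
            scale_uniform(out, in, n, begin);
            ++fast_groups;
        } else {
            scale_generic(out, in, n, begin);
        }
    }
    return fast_groups;
}
```

With n = 19 and LANES = 8, the groups starting at 0 and 8 take the specialized path and only the last group, starting at 16, falls back to the guarded version. The test that selects a variant is evaluated once per group instead of once per lane, which is the effect the sketch is meant to show.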
Simon Moll, Johannes Doerfert, Sebastian Hack
Added 31 Mar 2016
Updated 31 Mar 2016
Type Journal
Year 2016
Where CC