Input space splitting for OpenCL


Download compilers.cs.uni-saarland.de

The performance of OpenCL programs suffers from memory and control flow divergence. Therefore, OpenCL compilers employ static analyses to identify non-divergent control flow and memory accesses in order to produce faster code. However, divergence is often input-dependent and hence can be observed for some, but not all, inputs. In these cases, vectorizing compilers have to generate slow code because divergence can occur at run time. In this paper, we use a polyhedral abstraction to partition the input space of an OpenCL kernel. For each partition, divergence analysis produces more precise results, i.e., it can classify more code parts as non-divergent. Consequently, specializing the kernel for the input space partitions allows for generating better SIMD code because of less divergence. We implemented our technique in an OpenCL driver for the AVX instruction set and evaluate it on a range of OpenCL benchmarks. We observe speedups of up to 9× for irregular kernels over a state-of-the-art vectoriz...
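To make the idea of input-dependent divergence concrete, the following is a minimal hand-written sketch in plain C, not the paper's implementation. It assumes a SIMD width of 8 lanes and a hypothetical kernel `scale` whose bounds check `i < n` is uniform across each vector exactly when `n` is a multiple of the width. The names `scale_generic`, `scale_uniform`, and `WIDTH` are invented for illustration, and the partition condition is written by hand here, whereas the abstract describes deriving partitions from a polyhedral abstraction.

```c
#include <assert.h>
#include <stddef.h>

#define WIDTH 8 /* assumed SIMD width (8 float lanes); illustrative only */

/* Generic variant: the bounds check may differ between the lanes of one
   vector, so every lane is guarded individually (divergent control flow). */
void scale_generic(float *out, const float *in, float a, size_t n)
{
    for (size_t base = 0; base < n; base += WIDTH) {
        for (size_t lane = 0; lane < WIDTH; ++lane) {
            size_t i = base + lane;
            if (i < n) /* per-lane test: divergent in the last vector */
                out[i] = a * in[i];
        }
    }
}

/* Specialized variant for the input partition n % WIDTH == 0: every vector
   is fully in bounds, so the per-lane test disappears. */
void scale_uniform(float *out, const float *in, float a, size_t n)
{
    assert(n % WIDTH == 0); /* precondition of this partition */
    for (size_t base = 0; base < n; base += WIDTH) {
        for (size_t lane = 0; lane < WIDTH; ++lane)
            out[base + lane] = a * in[base + lane];
    }
}

/* Run-time dispatch: pick the variant that matches the input partition. */
void scale(float *out, const float *in, float a, size_t n)
{
    if (n % WIDTH == 0)
        scale_uniform(out, in, a, n);
    else
        scale_generic(out, in, a, n);
}
```

Calling `scale(out, in, 2.0f, 16)` takes the specialized path, while `scale(out, in, 2.0f, 13)` falls back to the generic one; both produce the same results as the generic variant alone. The inner lane loop of `scale_uniform` has no branch, which is the kind of code a vectorizer can map directly to full-width SIMD instructions.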

Simon Moll, Johannes Doerfert, Sebastian Hack
