Compiler and runtime support for enabling generalized reduction computations on heterogeneous parallel configurations