Graphics processing units (GPUs) provide both memory bandwidth and arithmetic performance far greater than that available on CPUs but, because of their Single-Instruction-Multiple...
Lattice gas cellular automata (LGCA) models provide a relatively fast means of simulating fluid flow and can give both quantitative and qualitative insights into flow patterns aro...
Mitchel Johnson, Daniel P. Playne, Kenneth A. Hawi...
Abstract. The technique of flattening nested data parallelism combines all the independent operations in nested apply-to-all constructs and generates large amounts of potential pa...
Daniel W. Palmer, Jan Prins, Siddhartha Chatterjee...
Abstract. Nested data-parallel programs often have large memory requirements due to their high degree of parallelism. Piecewise execution is an implementation technique used to min...