We investigate the well-known PRAM model of parallel computation as a practical parallel programming model. The two components of this project are a general-purpose PRAM programmin...
An important means of validating the design of commercial-grade shared memory multiprocessors is to run a large number of pseudo-random test programs on them. However, when intent...
Loop fusion is a program transformation that merges multiple loops into one. It is effective for reducing the synchronization overhead of parallel loops and for improving ...
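As a rough illustration of the transformation named in the loop fusion abstract above, the sketch below merges two loops over the same index range into one. It is not taken from the paper: the function names `unfused` and `fused` and the arithmetic in the loop bodies are hypothetical, chosen only to show the shape of the rewrite. Fusion is legal in this example because iteration `i` of the second loop reads only the value that iteration `i` of the first loop produced.

```python
def unfused(a):
    """Two separate loops; each one traverses the data once."""
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):   # loop 1: b[i] depends only on a[i]
        b[i] = a[i] * 2
    for i in range(n):   # loop 2: c[i] depends only on b[i]
        c[i] = b[i] + 1
    return b, c


def fused(a):
    """The same computation after fusion: a single loop, a single traversal."""
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):
        b[i] = a[i] * 2
        c[i] = b[i] + 1  # b[i] was produced in this same iteration
    return b, c
```

If each loop were run in parallel, the unfused version would typically need a synchronization point between the two loops, while the fused version needs only one at the end. That is the kind of overhead the abstract refers to.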
On a distributed memory machine, hand-coded message passing leads to the most efficient execution, but it is difficult to use. Parallelizing compilers can approach the performance...
In this paper, we propose an inherently parallel scheme for 3D image segmentation of large volume data on a GPU cluster. This method originates from an extended Lattice Boltzmann Mod...