We present a number of optimization techniques to compute prefix sums on linked lists and implement them on multithreaded GPUs using CUDA. Prefix computations on linked structures ...
For regular, sparse, linear systems, like those derived from regular grids, using High Performance Fortran (HPF) for iterative solvers is straightforward. However, for irregular ma...
This paper describes PI/OT, a template-based parallel I/O system. In PI/OT, I/O streams have annotations associated with them that are external to the source code. These annotatio...
Ian Parsons, Jonathan Schaeffer, Duane Szafron, Ro...
Data decomposition is probably the most successful method for generating parallel programs. In this paper a general framework is described for the automatic generation of parallel...
Edwin M. R. M. Paalvast, Henk J. Sips, Arjan J. C....
Due to the diffuse nature of light photons, Diffuse Optical Tomography (DOT) image reconstruction is a challenging 3D problem with a relatively large number of unknowns and limited meas...
Murat Guven, Birsen Yazici, Xavier Intes, Britton ...