Efficient determination of processing termination at barrier synchronization points can occupy an important role in the overall throughput of parallel and distributed computing sy...
We present a novel parallel algorithm for fast continuous collision detection (CCD) between deformable models using multi-core processors. We use a hierarchical representation to ...
We propose a generic design for Chinese remainder algorithms. A Chinese remainder computation consists in reconstructing an integer value from its residues modulo non coprime inte...
The ability to provide uniform shared-memory access to a significant number of processors in a single SMP node brings us much closer to the ideal PRAM parallel computer. In this pa...
David A. Bader, Ajith K. Illendula, Bernard M. E. ...
Minimizing communications when mapping affine loop nests onto distributed memory parallel computers has already drawn a lot of attention. This paper focuses on the next step: as i...