The polyhedral model provides powerful abstractions to optimize loop nests with regular accesses. Affine transformations in this model capture a complex sequence of execution-reord...
We compare the performance of systems consisting of one large cluster containing q processors with systems where processors are grouped into k clusters containing u processors eac...
The way the processes in a parallel program are scheduled on the processors of a multiprocessor system affects the performance significantly. Finding a schedule of processes to ...
We have taken a NIST molecular dynamics simulation program (md3), which was configured as a single sequential process running on a CRAY C90 vector supercomputer, and parallelized ...
Distributed programs are hard to write. A distributed debugger equipped with a mechanism to re-execute the traced computation in a controlled fashion can greatly facilitate the ...