Abstract. We extend the refinement calculus to permit the derivation of programs in the Bulk Synchronous Parallelism (BSP) style. This demonstrates that formal approaches developed...
We consider a certain class of parallel program segments in which the order of messages sent affects the completion time. We give a characterization of these parallel program segmen...
Loop distribution is one of the most useful techniques to reduce the execution time of parallel applications. Traditionally, loop scheduling algorithms are implemented based on pa...
HPC scientific computational models are notoriously difficult to develop, debug, and maintain. The reasons for this are multifaceted, including the difficulty of parallel programm...
Steve Quenette, Louis Moresi, P. D. Sunter, Bill F...