Many parallel applications from scientific computing use MPI global communication operations to collect or distribute data. Since the execution times of these communication opera...
We build an analytical model for an application that uses the master-slave paradigm. The model uses only three architecture parameters: latency, bandwidth and flop rate. Instea...
Applications must scale well to make efficient use of even medium-scale parallel systems. Because scaling problems are often difficult to diagnose, there is a critical need for sc...
Nathan R. Tallent, John M. Mellor-Crummey, Michael...
Solving large, irregular graph problems efficiently is challenging. Current software systems and commodity multiprocessors do not support fine-grained, irregular parallelism wel...
Guojing Cong, Sreedhar B. Kodali, Sriram Krishnamo...
This paper describes a source-to-source compilation tool for optimizing MPI-based parallel applications. This tool is able to automatically apply a “prepushing” transformation...