Sciweavers

PLDI
1993
ACM

Global Optimizations for Parallelism and Locality on Scalable Parallel Machines

Data locality is critical to achieving high performance on large-scale parallel machines. Non-local data accesses result in communication that can greatly impact performance. Thus, the mapping, or decomposition, of the computation and data onto the processors of a scalable parallel machine is a key issue in compiling programs for these architectures. This paper describes a compiler algorithm that automatically finds computation and data decompositions that optimize both parallelism and locality. This algorithm is designed for use with both distributed and shared address space machines. The scope of our algorithm is dense matrix computations where the array accesses are affine functions of the loop indices. Our algorithm can handle programs with general nestings of parallel and sequential loops. We present a mathematical framework that enables us to systematically derive the decompositions. Our algorithm can exploit parallelism in both fully parallelizable loops as well as loops that r...
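To make the abstract's terms concrete, here is a minimal sketch of what an affine array access and a decomposition might look like. It is not the algorithm from the paper. The helper names (`affine_access`, `block_owner`, `count_nonlocal`) and the simple one-dimensional block decomposition are assumptions chosen for illustration. The sketch counts how many loop iterations touch data owned by a different processor than the one running the iteration.

```python
# Illustrative sketch only; NOT the decomposition algorithm from the paper.
# All function names and the block decomposition are hypothetical.

def affine_access(F, k, i):
    """Return the array index F*i + k for the loop index vector i."""
    return tuple(
        sum(f * x for f, x in zip(row, i)) + c
        for row, c in zip(F, k)
    )

def block_owner(index, block, nprocs):
    """Assumed block decomposition: processor owning a 1-D index."""
    return min(index // block, nprocs - 1)

def count_nonlocal(n, nprocs, F, k):
    """Count iterations of `for i in range(n)` whose access A[F*i + k]
    is owned by a different processor than iteration i itself.
    Out-of-bounds accesses are skipped."""
    block = -(-n // nprocs)  # ceiling division
    misses = 0
    for i in range(n):
        (a,) = affine_access(F, k, (i,))
        if 0 <= a < n and block_owner(a, block, nprocs) != block_owner(i, block, nprocs):
            misses += 1
    return misses

if __name__ == "__main__":
    # A[i]: data and computation aligned, so every access is local.
    print(count_nonlocal(16, 4, [[1]], [0]))
    # A[i + 1]: only iterations at block boundaries cross processors.
    print(count_nonlocal(16, 4, [[1]], [1]))
    # A[i + 4]: the offset equals the block size, so every in-bounds access is remote.
    print(count_nonlocal(16, 4, [[1]], [4]))
```

Under these assumptions, the same loop can incur anywhere from zero to many non-local accesses depending on how data and iterations are aligned, which is the kind of cost a decomposition choice is meant to reduce.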
Jennifer-Ann M. Anderson, Monica S. Lam
Added 10 Aug 2010
Updated 10 Aug 2010
Type Conference
Year 1993
Where PLDI
Authors Jennifer-Ann M. Anderson, Monica S. Lam