In this paper we discuss issues related to the high-performance implementation of collective communications operations on distributed-memory computer architectures. Using a combina...
E. W. Chan, M. F. Heimlich, Avi Purkayastha, Rober...
We discuss the design and high-performance implementation of collective communications operations on distributed-memory computer architectures. Using a combination of known techni...
Ernie Chan, Marcel Heimlich, Avi Purkayastha, Robe...
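The specific techniques are truncated in the two abstracts above, but collectives of this kind are commonly composed from point-to-point messages arranged along a tree or ring. Purely as a hedged illustration (not necessarily the algorithms these authors describe), the following sketch builds a broadcast from sends and receives along a binomial tree rooted at rank 0; the function name tree_bcast and its parameters are invented for the example.

    #include <mpi.h>

    /* Hypothetical binomial-tree broadcast rooted at rank 0, composed only of
       point-to-point messages; shown to illustrate how a collective can be
       built from sends and receives. */
    void tree_bcast(int *buf, int count, MPI_Comm comm)
    {
        int rank, size;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);

        /* Non-root ranks first receive from their parent, obtained by
           clearing the lowest set bit of their own rank. */
        if (rank != 0)
            MPI_Recv(buf, count, MPI_INT, rank & (rank - 1), 0, comm,
                     MPI_STATUS_IGNORE);

        /* Forward to children: ranks formed by setting each bit position
           below the lowest set bit of our own rank. */
        for (int mask = 1; mask < size && !(rank & mask); mask <<= 1) {
            int child = rank | mask;
            if (child < size)
                MPI_Send(buf, count, MPI_INT, child, 0, comm);
        }
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, value = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) value = 42;          /* root supplies the data */
        tree_bcast(&value, 1, MPI_COMM_WORLD);
        MPI_Finalize();
        return 0;
    }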
Most parallel systems on which MPI is used are now hierarchical: some processors are much closer to others in terms of interconnect performance. One of the most common su...
Hao Zhu, David Goodell, William Gropp, Rajeev Thak...
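One common way to exploit such a hierarchy, sketched here only as an illustration and not as these authors' implementation, is to split MPI_COMM_WORLD into a node-local (shared-memory) communicator and a communicator of one leader per node, and then perform a reduction in two stages:

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int world_rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

        /* Communicator of the ranks that share a node (shared memory). */
        MPI_Comm node_comm;
        MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                            MPI_INFO_NULL, &node_comm);

        int node_rank;
        MPI_Comm_rank(node_comm, &node_rank);

        /* Communicator containing one "leader" rank per node. */
        MPI_Comm leader_comm;
        MPI_Comm_split(MPI_COMM_WORLD, node_rank == 0 ? 0 : MPI_UNDEFINED,
                       world_rank, &leader_comm);

        /* Hierarchical allreduce: reduce within each node, combine across
           node leaders, then broadcast the result back inside each node. */
        double local = (double)world_rank, node_sum = 0.0, total = 0.0;
        MPI_Reduce(&local, &node_sum, 1, MPI_DOUBLE, MPI_SUM, 0, node_comm);
        if (leader_comm != MPI_COMM_NULL)
            MPI_Allreduce(&node_sum, &total, 1, MPI_DOUBLE, MPI_SUM, leader_comm);
        MPI_Bcast(&total, 1, MPI_DOUBLE, 0, node_comm);

        if (leader_comm != MPI_COMM_NULL) MPI_Comm_free(&leader_comm);
        MPI_Comm_free(&node_comm);
        MPI_Finalize();
        return 0;
    }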
One of the most important collective communication patterns used in scientific applications is the complete exchange, also called All-to-All. Although efficient complete exchange ...
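For reference, the complete exchange is exposed in MPI as MPI_Alltoall, where block i of each process's send buffer is delivered to process i. A minimal usage sketch follows; the buffer contents are arbitrary.

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each process contributes one int for every process, so the send
           and receive buffers both hold 'size' elements. */
        int *sendbuf = malloc(size * sizeof(int));
        int *recvbuf = malloc(size * sizeof(int));
        for (int i = 0; i < size; i++)
            sendbuf[i] = rank * 100 + i;    /* block destined for rank i */

        /* Complete exchange: block i of sendbuf goes to rank i, and block j
           of recvbuf arrives from rank j. */
        MPI_Alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

        free(sendbuf);
        free(recvbuf);
        MPI_Finalize();
        return 0;
    }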
Many parallel applications from scientific computing use MPI global communication operations to collect or distribute data. Since the execution times of these communication opera...
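Those execution times are typically measured with MPI's wall-clock timer. The rough micro-benchmark sketch below, with message size and repetition count chosen arbitrarily, times an MPI_Allreduce; it is an illustration of the measurement, not of any optimization proposed in the paper.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        enum { COUNT = 1 << 16, REPS = 100 };   /* arbitrary problem size */
        static double buf[COUNT], result[COUNT];

        MPI_Barrier(MPI_COMM_WORLD);            /* synchronize before timing */
        double t0 = MPI_Wtime();
        for (int i = 0; i < REPS; i++)
            MPI_Allreduce(buf, result, COUNT, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
        double elapsed = (MPI_Wtime() - t0) / REPS;

        if (rank == 0)
            printf("average MPI_Allreduce time: %g s\n", elapsed);

        MPI_Finalize();
        return 0;
    }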