Sciweavers

PPOPP
1999
ACM

MagPIe: MPI's Collective Communication Operations for Clustered Wide Area Systems

Writing parallel applications for computational grids is a challenging task. To achieve good performance, algorithms designed for local area networks must be adapted to the differences in link speeds. An important class of algorithms are collective operations, such as broadcast and reduce. We have developed MAGPIE, a library of collective communication operations optimized for wide area systems. MAGPIE’s algorithms send the minimal amount of data over the slow wide area links, and only incur a single wide area latency. Using our system, existing MPI applications can be run unmodified on geographically distributed systems. On moderate cluster sizes, using a wide area latency of 10 milliseconds and a bandwidth of 1 MByte/s, MAGPIE executes operations up to 10 times faster than MPICH, a widely used MPI implementation; application kernels improve by up to a factor of 4. Due to the structure of our algorithms, MAGPIE’s advantage increases for higher wide area latencies.
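The two properties claimed in the abstract (minimal data over the wide area links, and a single wide area latency) can be illustrated with a small broadcast model. The sketch below is not MagPIe's code. It assumes a hypothetical layout of 4 clusters with 4 ranks each, and assumes the cluster-aware scheme has the root send one message to a coordinator in every remote cluster, after which each cluster runs a local binomial tree. All function names are invented for illustration. It compares that scheme with a topology-unaware binomial tree over all ranks.

```python
def binomial_tree_edges(ranks):
    """(sender, receiver) pairs of a binomial broadcast tree rooted at ranks[0]."""
    edges = []
    step = 1
    while step < len(ranks):
        # In each round, every rank that already holds the data forwards it once.
        for i in range(step):
            if i + step < len(ranks):
                edges.append((ranks[i], ranks[i + step]))
        step *= 2
    return edges


def two_level_edges(clusters, root):
    """Assumed cluster-aware broadcast: one wide area message per remote
    cluster, then a local binomial tree inside every cluster."""
    edges = []
    for cluster in clusters:
        if root in cluster:
            local = [root] + [r for r in cluster if r != root]
        else:
            coordinator = cluster[0]  # hypothetical choice of coordinator
            edges.append((root, coordinator))  # the only wide area message
            local = cluster
        edges.extend(binomial_tree_edges(local))
    return edges


def home_map(clusters):
    """Map each rank to the index of the cluster it lives in."""
    return {r: i for i, cluster in enumerate(clusters) for r in cluster}


def wan_messages(edges, clusters):
    """Number of messages that cross a cluster boundary."""
    home = home_map(clusters)
    return sum(1 for s, d in edges if home[s] != home[d])


def max_wan_hops(edges, clusters, root):
    """Largest number of wide area links on any root-to-receiver path.
    Relies on edges being ordered so that each sender has already received."""
    home = home_map(clusters)
    hops = {root: 0}
    for s, d in edges:
        hops[d] = hops[s] + (1 if home[s] != home[d] else 0)
    return max(hops.values())


clusters = [list(range(c * 4, c * 4 + 4)) for c in range(4)]  # assumed layout
flat = binomial_tree_edges(list(range(16)))   # topology-unaware
aware = two_level_edges(clusters, root=0)     # cluster-aware

print(wan_messages(flat, clusters), max_wan_hops(flat, clusters, 0))    # 12 2
print(wan_messages(aware, clusters), max_wan_hops(aware, clusters, 0))  # 3 1
```

In this model both schemes deliver 15 messages in total, but the cluster-aware one crosses the wide area only once per remote cluster and never chains two wide area hops. That is consistent with the abstract's remark that the advantage grows with wide area latency.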
Type Conference
Year 1999
Where PPOPP
Authors Thilo Kielmann, Rutger F. H. Hofman, Henri E. Bal, Aske Plaat, Raoul Bhoedjang