Sciweavers

IPPS
2010
IEEE
13 years 2 months ago
Overlapping computation and communication: Barrier algorithms and ConnectX-2 CORE-Direct capabilities
Abstract: This paper explores the computation and communication overlap capabilities enabled by the new CORE-Direct hardware capabilities introduced in the InfiniBand (IB) Host Cha...
Richard L. Graham, Stephen W. Poole, Pavel Shamis,...
PAAPP
2010
13 years 3 months ago
Accurately measuring overhead, communication time and progression of blocking and nonblocking collective operations at massive s
Accurate, reproducible and comparable measurement of the overheads, communication times and progression behavior of blocking and nonblocking collective operations is a complicated...
Torsten Hoefler, Timo Schneider, Andrew Lumsdaine
PC
2007
13 years 4 months ago
Optimizing a conjugate gradient solver with non-blocking collective operations
This paper presents a case study about the applicability and usage of non-blocking collective operations. These operations provide the ability to overlap communication with computa...
Torsten Hoefler, Peter Gottschling, Andrew Lumsdai...
IJHPCN
2006
13 years 4 months ago
Implications of application usage characteristics for collective communication offload
Abstract: The performance of collective communication operations is known to have a significant impact on the scalability of some applications. Indeed, the global, synchronous nat...
Ron Brightwell, Sue Goudy, Arun Rodrigues, Keith D...
HPDC
2010
IEEE
13 years 6 months ago
LogGOPSim: simulating large-scale applications in the LogGOPS model
We introduce LogGOPSim, a fast simulation framework for parallel algorithms at large scale. LogGOPSim utilizes a slightly extended version of the well-known LogGPS model in combin...
Torsten Hoefler, Timo Schneider, Andrew Lumsdaine
CLUSTER
2004
IEEE
13 years 8 months ago
Scalable, high-performance NIC-based all-to-all broadcast over Myrinet/GM
All-to-all broadcast is one of the common collective operations that involve dense communication between all processes in a parallel program. Previously, programmable Network Inte...
Weikuan Yu, Dhabaleswar K. Panda, Darius Buntinas
IPPS
1996
IEEE
13 years 9 months ago
ECO: Efficient Collective Operations for Communication on Heterogeneous Networks
PVM and other distributed computing systems have enabled the use of networks of workstations for parallel computation, but their approach of treating all networks as collections o...
Bruce Lowekamp, Adam Beguelin
PVM
1999
Springer
13 years 9 months ago
Implementing MPI-2 Extended Collective Operations
This paper describes a first approach to implement MPI-2’s Extended Collective Operations. We aimed to ascertain the feasibility and effectiveness of such a project based on exis...
Pedro V. Silva, João Gabriel Silva
IPPS
1999
IEEE
13 years 9 months ago
Flexible Collective Operations for Distributed Object Groups
Collective operations on multiple distributed objects are a powerful means to coordinate parallel computations. In this paper we present an inheritance-based approach to implement ...
Jörg Nolte
IPPS
1999
IEEE
13 years 9 months ago
Optimization Rules for Programming with Collective Operations
We study how several collective operations, such as broadcast, reduction, and scan, can be composed efficiently in complex parallel programs. Our specific contributions are: (1) a fo...
Sergei Gorlatch, Christoph Wedler, Christian Lenga...