Sciweavers

ICS
2009
Tsinghua U.

Efficient high performance collective communication for the Cell blade

This paper presents high-performance collective communication algorithms and implementations that exploit the unique architectural features of the Cell heterogeneous multicore processor. It describes novel algorithms for the barrier, broadcast, reduce, all-reduce, and all-gather collective operations, and demonstrates their efficiency by comparing them to the fastest previously known implementations of these operations targeting the Cell. The new implementations are faster than the published state-of-the-art, achieving up to 19.21 times the performance (a 95% reduction in latency) of the previously published collective communication work for the Cell [19, 25]. The results show performance both within a chip and across the two Cell chips on a Cell blade [10].
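For readers unfamiliar with the collective operations named in the abstract, the following is a minimal sketch of their data-movement semantics only. It is not the paper's algorithm and contains nothing Cell-specific: each "rank" is modeled as one slot in a Python list, and all function names are hypothetical, chosen for illustration. Barrier is omitted because it synchronizes ranks without moving data. The last helper checks the abstract's arithmetic relating a speedup factor to a latency reduction.

```python
from functools import reduce as fold
from operator import add

# Assumption: a list models the ranks; index i holds rank i's local value.
# Each function returns what every rank holds after the collective completes.

def broadcast(values, root=0):
    """Every rank ends up with a copy of the root's value."""
    return [values[root]] * len(values)

def reduce_to_root(values, op=add, root=0):
    """Only the root receives the combined value; other ranks get None."""
    total = fold(op, values)
    return [total if rank == root else None for rank in range(len(values))]

def all_reduce(values, op=add):
    """Every rank receives the combined value."""
    total = fold(op, values)
    return [total] * len(values)

def all_gather(values):
    """Every rank receives the full list of all ranks' values."""
    return [list(values) for _ in values]

def latency_reduction(speedup):
    """Fraction of latency removed by a given speedup: 1 - 1/speedup."""
    return 1.0 - 1.0 / speedup
```

With these definitions, `latency_reduction(19.21)` evaluates to roughly 0.948, which is consistent with the abstract's rounded figure of a 95% reduction in latency.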
Qasim Ali, Samuel P. Midkiff, Vijay S. Pai
Added 19 Feb 2011
Updated 19 Feb 2011
Type Conference
Year 2009
Where ICS