We describe a generic programming model to design collective communications on SMP clusters. The programming model utilizes shared memory for collective communications and overlap...
The advances in multicore technology and modern interconnects is rapidly accelerating the number of cores deployed in today’s commodity clusters. A majority of parallel applicat...
Amith R. Mamidala, Rahul Kumar, Debraj De, Dhabale...
Many parallel applications from scientific computing use MPI collective communication operations to collect or distribute data. Since the execution times of these communication op...
In this paper, we analyze restrictions of traditional communication performance models affecting the accuracy of analytical prediction of the execution time of collective communic...
Alexey L. Lastovetsky, Vladimir Rychkov, Maureen O...
In order to meet flexibility, performance and energy efficiency constraints, future SoC (System-on-Chip) designs will contain an increasing number of heterogeneous processor cor...
Andreas Wieferink, Rainer Leupers, Gerd Ascheid, H...