As the number of cores inside compute clusters continues to grow, the scalability of MPI (Message Passing Interface) is important to ensure that programs can continue to execute o...
Due to the complexity associated with developing parallel applications, scientists and engineers rely on highlevel software libraries such as PETSc, ScaLAPACK and PESSL to ease th...
Pavan Balaji, Darius Buntinas, Satish Balay, Barry...
MPI defines one-sided communication operations—put, get, and accumulate—together with three different synchronization mechanisms that define the semantics associated with th...
Abstract--This paper presents a GALS-compatible circuitswitched on-chip network that is well suited for use in many-core platforms targeting streaming DSP and embedded applications...
Per-core local (scratchpad) memories allow direct inter-core communication, with latency and energy advantages over coherent cache-based communication, especially as CMP architect...
Stamatis G. Kavadias, Manolis Katevenis, Michail Z...