This paper describes the synchronization and communication primitives of the Cray T3E multiprocessor, a shared memory system scalable to 2048 processors. We discuss what we have l...
The ability to dynamically adapt an unstructured grid (or mesh) is a powerful tool for solving computational problems with evolving physical features; however, an efficient parall...
Rupak Biswas, Leonid Oliker, Sajal K. Das, Daniel ...
A new inter-processor communication architecture for chip multiprocessors is proposed which has a low area cost, flexible routing capability, and supports globally asynchronous loc...
We present a cross-layer customization methodology for latency and bandwidth efficient inter-core communication in embedded multiprocessors. The methodology integrates compiler, o...
This paper investigates the performance and power dissipation of Globally Asynchronous Locally Synchronous (GALS) multi-processor systems. We show that communication loops are a s...