As the number of cores and threads in manycore compute accelerators such as Graphics Processing Units (GPU) increases, so does the importance of on-chip interconnection network des...
Silicon technology will continue to provide an exponential increase in the availability of raw transistors. Effectively translating this resource into application performance, how...
Steven Swanson, Ken Michelson, Andrew Schwerin, Ma...
The caching behavior of multimedia applications has been described as having high instruction reference locality within small loops, very large working sets, and poor data cache p...
In an effort to push the envelope of system performance, microprocessor designs are continually exploiting higher levels of instruction-level parallelism, resulting in increasing ...
Several recent papers have proposed or analyzed optimal algorithms to route all-to-all personalizedcommunication (AAPC) over communication networks such as meshes, hypercubes and ...