Emergence of new parallel architectures presents new challenges for application developers. Supercomputers vary in processor speed, network topology, interconnect communication ch...
Abhinav Bhatele, Lukasz Wesolowski, Eric J. Bohm, ...
Existing supercomputers have hundreds of thousands of processor cores, and future systems may have hundreds of millions. Developers need detailed performance measurements to tune ...
Todd Gamblin, Bronis R. de Supinski, Martin Schulz...
The problem attacked in this paper is one of automatically mapping an application onto a Network-on-Chip (NoC) based chip multiprocessor (CMP) architecture in a locality-aware fas...
Guangyu Chen, Feihui Li, Seung Woo Son, Mahmut T. ...
Most modern Chip Multiprocessors (CMP) feature shared cache on chip. For multithreaded applications, the sharing reduces communication latency among co-running threads, but also r...
Recent proposals for Chip Multiprocessors (CMPs) advocate speculative, or implicit, threading in which the hardware employs prediction to peel off instruction sequences (i.e., imp...
Chong-liang Ooi, Seon Wook Kim, Il Park, Rudolf Ei...