In this paper we present an exhaustive evaluation of the memory subsystem in a chip-multiprocessor (CMP) architecture composed of 16 cores. The characterization is performed making...
—Massively parallel scientific applications, running on extreme-scale supercomputers, produce hundreds of terabytes of data per run, driving the need for storage solutions to im...
Ramya Prabhakar, Sudharshan S. Vazhkudai, Youngjae...
Control independence has been put forward as a significant new source of instruction-level parallelism for future generation processors. However, its performance potential under p...
—During post-silicon processor debugging, we need to frequently capture and dump out the internal state of the processor. Since internal state constitutes all memory elements, th...
Anant Vishnoi, Preeti Ranjan Panda, M. Balakrishna...
Abstract In this paper, we focus on the problem of scheduling batches of identical task graphs on a heterogeneous platform, when the task graph consists in a tree. We rely on stead...