In a parallel system with multiple CPUs, one of the key problems is to assign loop iterations to processors. This problem, known as the loop scheduling problem, has been studied i...
Mahmut T. Kandemir, Taylan Yemliha, Seung Woo Son,...
Partitioning a memory into multiple blocks that can be independently accessed is a widely used technique to reduce its dynamic power. For embedded systems, its benefits can be ev...
Olga Golubeva, Mirko Loghi, Massimo Poncino, Enric...
There are two avenues for many-core machines to gain higher performance: increasing the number of processors, and increasing the number of vector units in one SIMD processor. A tru...
Xiaohuang Huang, Christopher I. Rodrigues, Stephen...
Caching techniques have been an efficient mechanism for mitigating the effects of the processor-memory speed gap. Traditional multi-level SRAM-based cache hierarchies, especially...
We present a methodology to debug a SOC by concentrating on its communication. Our extended communication model includes a) multiple signal groups per interface protocol at each I...