On Chip Multiprocessors (CMP), it is common that multiple cores share certain levels of cache. The sharing increases the contention in cache and memory-to-chip bandwidth, further h...
Yunlian Jiang, Eddy Z. Zhang, Kai Tian, Xipeng She...
Due to shared cache contentions and interconnect delays, data prefetching is more critical in alleviating penalties from increasing memory latencies and demands on Chip-Multiproce...
Xudong Shi, Zhen Yang, Jih-Kwon Peir, Lu Peng, Yen...
Recent research in embedded computing indicates that packing multiple processor cores on the same die is an effective way of utilizing the ever-increasing number of transistors. T...
Reuse distance analysis has been proved promising in evaluating and predicting data locality for programs written in Fortran or C/C++. But its effect has not been examined for ap...
The increasing use of Multiprocessor Systems-on-Chip (MPSoCs) for high performance demands of embedded applications results in high power dissipation. The memory subsystem is a la...
Ilya Issenin, Erik Brockmeyer, Bart Durinck, Nikil...