High-performance processors use a large set–associative L1 data cache with multiple ports. As clock speeds and size increase such a cache consumes a significant percentage of t...
Dan Nicolaescu, Alexander V. Veidenbaum, Alexandru...
In this paper, we propose a fully automatic dynamic scratchpad memory (SPM) management technique for instructions. Our technique loads required code segments into the SPM on deman...
Bernhard Egger, Chihun Kim, Choonki Jang, Yoonsung...
Small area and code size are two critical design issues in most of embedded system designs. In this paper, we tackle these issues by customizing forwarding networks and instructio...
Swarnalatha Radhakrishnan, Hui Guo, Sri Parameswar...
The cache hierarchy design in existing SMT and superscalar processors is optimized for latency, but not for bandwidth. The size of the L1 data cache did not scale over the past dec...
Modern embedded microprocessors use low power on-chip memories called scratch-pad memories to store frequently executed instructions and data. Unlike traditional caches, scratch-p...
Rajiv A. Ravindran, Pracheeti D. Nagarkar, Ganesh ...