Sciweavers

HPCA
2001
IEEE

Reducing DRAM Latencies with an Integrated Memory Hierarchy Design

In this paper, we address the severe performance gap caused by high processor clock rates and slow DRAM accesses. We show that even with an aggressive, next-generation memory system using four Direct Rambus channels and an integrated one-megabyte level-two cache, a processor still spends over half of its time stalling for L2 misses. Large cache blocks can improve performance, but only when coupled with wide memory channels. DRAM address mappings also affect performance significantly. We evaluate an aggressive prefetch unit integrated with the L2 cache and memory controllers. By issuing prefetches only when the Rambus channels are idle, prioritizing them to maximize DRAM row buffer hits, and giving them low replacement priority, we achieve a 43% speedup across 10 of the 26 SPEC2000 benchmarks, without degrading performance on the others. With eight Rambus channels, these ten benchmarks improve to within 10% of the performance of a perfect L2 cache.
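
The three prefetch policies named in the abstract (issue only on an idle channel, prefer row buffer hits, insert with low replacement priority) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the authors' design: the names Channel, pick_prefetch, and LRUSet, the (bank, row, block) candidate format, and the choice to place prefetched blocks directly in the LRU position are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Channel:
    """Hypothetical model of one memory channel."""
    busy: bool = False                              # True while a demand access uses the channel
    open_rows: dict = field(default_factory=dict)   # bank -> row currently held in its row buffer


def pick_prefetch(channel, candidates):
    """Choose a prefetch to issue now, or return None.

    candidates is a list of (bank, row, block) tuples, oldest first.
    """
    # Policy 1: never compete with demand traffic; issue only when idle.
    if channel.busy or not candidates:
        return None
    # Policy 2: prefer a candidate whose row is already open (row buffer hit).
    for cand in candidates:
        bank, row, _block = cand
        if channel.open_rows.get(bank) == row:
            return cand
    # No row buffer hit available: fall back to the oldest candidate.
    return candidates[0]


class LRUSet:
    """One cache set; left end of the deque is MRU, right end is LRU."""

    def __init__(self, ways):
        self.ways = ways
        self.blocks = deque()

    def _evict_if_full(self):
        if len(self.blocks) >= self.ways:
            self.blocks.pop()                       # drop the LRU block

    def insert_demand(self, block):
        self._evict_if_full()
        self.blocks.appendleft(block)               # demand fills enter as MRU

    def insert_prefetch(self, block):
        # Policy 3 (assumed form): prefetched blocks enter at the LRU
        # position, so a useless prefetch is the first block replaced.
        self._evict_if_full()
        self.blocks.append(block)
```

In this sketch, a prefetched block that is never referenced is evicted by the next fill to its set, which is one simple way to keep speculative data from displacing useful demand-fetched blocks.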
Wei-Fen Lin, Steven K. Reinhardt, Doug Burger
Added 01 Dec 2009
Updated 01 Dec 2009
Type Conference
Year 2001
Where HPCA
Authors Wei-Fen Lin, Steven K. Reinhardt, Doug Burger