This paper presents CMP Cooperative Caching, a unified framework to manage a CMP’s aggregate on-chip cache resources. Cooperative caching combines the strengths of private and ...
Performance loss due to long-latency memory accesses can be reduced by servicing multiple memory accesses concurrently. The notion of generating and servicing long-latency cache m...
Moinuddin K. Qureshi, Daniel N. Lynch, Onur Mutlu,...
Level one cache normally resides on a processor’s critical path, which determines the clock frequency. Directmapped caches exhibit fast access time but poor hit rates compared w...
Current simulation-sampling techniques construct accurate model state for each measurement by continuously warming large microarchitectural structures (e.g., caches and the branch...
Thomas F. Wenisch, Roland E. Wunderlich, Babak Fal...
Stride-based prefetching mechanisms exploit regular streams of memory accesses to hide memory latency. While these mechanisms are effective, they can be improved by studying the p...