Reuse distance (i.e. LRU stack distance) precisely characterizes program locality and has been a basic tool for memory system research since the 1970s. However, the high cost of m...
Xipeng Shen, Jonathan Shaw, Brian Meeker, Chen Din...
This paper studies an interesting yet less explored behavior of memory access instructions, called access region locality. Unlike the traditional temporal and spatial data localit...
This paper presents a quantification of the timing effects that advanced processor features like data and instruction cache, pipelines, branch prediction units and out-oforder ex...
In this paper we describe our experience with Teapot [7], a domain-specific language for writing cache coherence protocols. Cache coherence is of concern when parallel and distrib...
Satish Chandra, James R. Larus, Michael Dahlin, Br...
For many applications, branch mispredictions and cache misses limit a processor’s performance to a level well below its peak instruction throughput. A small fraction of static i...