While caches have become invaluable for higher-end architectures due to their ability to hide, in part, the gap between processor speed and memory access times, caches (and partic...
For many programs, especially integer codes, untolerated load instruction latencies account for a significant portion of total execution time. In this paper, we present the desig...
Todd M. Austin, Dionisios N. Pnevmatikatos, Gurind...
Smaller input data sets such as the test and the train input sets are commonly used in simulation to estimate the impact of architecture/micro-architecture features on the perform...
Wei-Chung Hsu, Howard Chen, Pen-Chung Yew, Dong-yu...
For many aspects of memory theoretical treatment already exists, in particular for: simple cache construction, store buffers and store buffer forwarding, cache coherence protocols, o...
Ulan Degenbaev, Wolfgang J. Paul, Norbert Schirmer
SARC merges cache controller and network interface functions by relying on a single hardware primitive: each access checks the tag and the state of the addressed line for possible...