Modern applications demand support for a large number of clients and require large scale storage subsystems. This paper presents a theoretical model of prefetching and caching of s...
Recently there has been a surge of interest in developing performance debugging tools to help programmers tune their applications for better memory performance [2, 4, 10]. These t...
Margaret Martonosi, Anoop Gupta, Thomas E. Anderso...
Heterogeneous multicores, such as Cell BE processors and GPGPUs, typically do not have caches for their accelerator cores because coherence traffic, cache misses, and latencies fr...
In this paper, we investigate methods for improving the hit rates in the first level of memory hierarchy. Particularly, we propose victim cache structures to reduce the number of ...
Gokhan Memik, Glenn Reinman, William H. Mangione-S...
Object-based parallel and distributed applications are becoming increasingly popular, driven by the programmability advantages of component technology and a flat shared-object spa...