Cache locality optimization is an efficient way for reducing the idle time of modern processors in waiting for needed data. This kind of optimization can be achieved either on the...
This paper describes Embra, a simulator for the processors, caches, and memory systems of uniprocessors and cache-coherent multiprocessors. When running as part of the SimOS simul...
Many parallel scientific applications need high-performance I/O. Unfortunately, end-to-end parallel-I/O performance has not been able to keep up with substantial improvements in p...
A high accuracy system for transistor-level static timing analysis is presented. Accurate static timing verification requires that individual gate and interconnect delays be accu...
Pawan Kulshreshtha, Robert Palermo, Mohammad Morta...
In this paper, we investigate methods for improving the hit rates in the first level of memory hierarchy. Particularly, we propose victim cache structures to reduce the number of ...
Gokhan Memik, Glenn Reinman, William H. Mangione-S...