This paper describes Compiler-Directed Content-Aware Prefetching (CDCAP), an integrated compiler and hardware approach for prefetching dynamic data structures. The approach utiliz...
Recent work has shown that multithreaded workloads running in execution-driven, full-system simulation environments cannot use instructions per cycle (IPC) as a valid performance ...
In this paper, we study the effects of manipulating the architected direction of conditional branches. Through the use of statistical sampling, we find that about 40% of all dyna...
Improving cache performance requires understanding cache behavior. However, measuring cache performance for one or two data input sets provides little insight into how cache behav...
The processes of accessing a shared communication medium have been extensively researched in the dependability and real-time areas. For embedded systems, the primary approaches have...