Effective modeling and management of hardware resources have always been critical toward generating highly efficient code in static compilers. With Just-In-Time compilation and dy...
This paper analyzes an Intel Pentium 4 hyper-threading processor. The focus is to understand its performance and the underlying reasons behind that performance. Particular attenti...
Improving cache performance requires understanding cache behavior. However, measuring cache performance for one or two data input sets provides little insight into how cache behav...
This paper describes Compiler-Directed Content-Aware Prefetching (CDCAP), an integrated compiler and hardware approach for prefetching dynamic data structures. The approach utiliz...
Recent work has shown that multithreaded workloads running in execution-driven, full-system simulation environments cannot use instructions per cycle (IPC) as a valid performance ...