Today’s general-purpose processors are increasingly using multithreading in order to better leverage the additional on-chip real estate available with each technology generation...
Ali El-Moursy, Rajeev Garg, David H. Albonesi, San...
This paper presents an automated performance tuning solution, which partitions a program into a number of tuning sections and finds the best combination of compiler options for e...
Detailed or cycle-accurate/bit-accurate (CABA) simulation is a critical phase in the design flow of embedded systems. However, with increasing system complexity, full detailed sim...
As chip multiprocessors (CMPs) become increasingly mainstream, architects have likewise become more interested in how best to share a cache hierarchy among multiple simultaneous t...
Lisa R. Hsu, Steven K. Reinhardt, Ravishankar R. I...
This paper investigates helper threads that improve performance by prefetching data on behalf of an application’s main thread. The focus is data prefetch helper threads that lac...