The performance requirements for contemporary microprocessors are increasing as rapidly as their number of applications grows. By accelerating the clock, performance can be gained...
The load/store queue (LSQ) is one of the most complex parts of contemporary processors. Its latency is critical for the processor performance and it is usually one of the processo...
In recent years, several approaches have been proposed to use profile information in compiler optimization. This profile information can be used at the source level to guide loo...
Masayo Haneda, Peter M. W. Knijnenburg, Harry A. G...
Traditional code optimizers have produced significant performance improvements over the past forty years. While promising avenues of research still exist, traditional static and p...
Jason Hiser, Naveen Kumar, Min Zhao, Shukang Zhou,...
High-performance multiprocessor systems built around out-of-order processors with aggressive branch predictors execute many memory references that turn out to be on a mispredicted...
Resit Sendag, Ayse Yilmazer, Joshua J. Yi, Augustu...