Utilizing the nonuniform latencies of SDRAM devices, access reordering mechanisms alter the sequence of main memory access streams to reduce the observed access latency. Using a r...
Previous proposals for power-aware thread-level parallelism on chip multiprocessors (CMPs) mostly focus on multiprogrammed workloads. Nonetheless, parallel computation of a single...
An integrated, hardware / software co-designed CISC processor is proposed and analyzed. The objectives are high performance and reduced complexity. Although the x86 ISA is targete...
Shiliang Hu, Ilhyun Kim, Mikko H. Lipasti, James E...
Due to increasing power densities, both on-chip average and peak temperatures are fast becoming a serious bottleneck in processor design. This is due to the cost of removing the h...
This paper proposes and evaluates hardware mechanisms for supporting prescient instruction prefetch--an approach to improving single-threaded application performance by using help...
Tor M. Aamodt, Paul Chow, Per Hammarlund, Hong Wan...