As the performance gap between the CPU and main memory continues to grow, techniques to hide memory latency are essential to deliver a high performance computer system. Prefetchin...
One can e ectively utilize predicated execution to improve branch handling in instruction-level parallel processors. Although the potential bene ts of predicated execution are hig...
Scott A. Mahlke, Richard E. Hank, James E. McCormi...
One of the open questions in the design of multimedia storage servers is in what order to serve incoming requests. Given the capability provided by the disk layout and scheduling ...
The goal of this work is to explore architectural mechanisms for supporting explicit communication in cachecoherent shared memory multiprocessors. The motivation stems from the ob...
Reducing feature sizes and power supply voltage allows integrating more processing units (PUs) on multiprocessor system-on-chip (MPSoC) to satisfy the increasing demands of applic...
Yu Wang 0002, Jiang Xu, Shengxi Huang, Weichen Liu...