Performance loss due to long-latency memory accesses can be reduced by servicing multiple memory accesses concurrently. The notion of generating and servicing long-latency cache m...
Moinuddin K. Qureshi, Daniel N. Lynch, Onur Mutlu,...
— Processor architectures with large instruction windows have been proposed to expose more instruction-level parallelism (ILP) and increase performance. Some of the proposed arch...
Isidro Gonzalez, Marco Galluzzi, Alexander V. Veid...
With the latest high-end computing nodes combining shared-memory multiprocessing with hardware multithreading, new scheduling policies are necessary for workloads consisting of mu...
Robert L. McGregor, Christos D. Antonopoulos, Dimi...
Power consumption and DRAM latencies are serious concerns in modern chip-multiprocessor (CMP or multi-core) based compute systems. The management of the DRAM row buffer can signif...
Kshitij Sudan, Niladrish Chatterjee, David Nellans...
Existing DRAM controllers employ rigid, non-adaptive scheduling and buffer management policies when servicing prefetch requests. Some controllers treat prefetch requests the same ...