We present a general scheme for virtualizing main memory errorcorrection mechanisms, which map redundant information needed to correct errors into the memory namespace itself. We ...
The use of large instruction windows coupled with aggressive out-oforder and prefetching capabilities has provided significant improvements in processor performance. In this paper...
The disparity between microprocessor clock frequencies and memory latency is a primary reason why many demanding applications run well below peak achievable performance. Software c...
Joseph Gebis, Leonid Oliker, John Shalf, Samuel Wi...
Efficient intra-node shared memory communication is important for High Performance Computing (HPC), especially with the emergence of multi-core architectures. As clusters continue ...
Adaptive computers combine conventional software programmable processors with reconfigurable compute units. We present techniques that allow the high-performance realization of de...