On modern computers, the performance of programs is often limited by memory latency rather than by processor cycle time. To reduce the impact of memory latency, the restructuring ...
Induprakas Kodukula, Keshav Pingali, Robert Cox, D...
Tile-based rendering (also called chunk rendering or bucket rendering) is a promising technique for low-power, 3D graphics platforms. This technique decomposes a scene into smalle...
Iosif Antochi, Ben H. H. Juurlink, Stamatis Vassil...
Server consolidation is becoming an increasingly popular technique to manage and utilize systems. This paper develops CMP memory systems for server consolidation where most sharin...
This paper presents a novel optimizing compiler for general purpose computation on graphics processing units (GPGPU). It addresses two major challenges of developing high performa...
Abstract. Traditional parallel programming methodologies for improving performance assume cache-based parallel systems. However, new architectures, like the IBM Cyclops-64 (C64), b...
Elkin Garcia, Ioannis E. Venetis, Rishi Khan, Guan...