Sciweavers

108 search results - page 20 / 22
» Achieving efficient register allocation via parallelism
Sort
View
TOG
2008
106views more  TOG 2008»
14 years 9 months ago
BSGP: bulk-synchronous GPU programming
We present BSGP, a new programming language for general purpose computation on the GPU. A BSGP program looks much the same as a sequential C program. Programmers only need to supp...
Qiming Hou, Kun Zhou, Baining Guo
MICRO
2003
IEEE
143views Hardware» more  MICRO 2003»
15 years 3 months ago
VSV: L2-Miss-Driven Variable Supply-Voltage Scaling for Low Power
Energy-efficient processor design is becoming more and more important with technology scaling and with high performance requirements. Supply-voltage scaling is an efficient way to...
Hai Li, Chen-Yong Cher, T. N. Vijaykumar, Kaushik ...
CCGRID
2010
IEEE
14 years 10 months ago
FaReS: Fair Resource Scheduling for VMM-Bypass InfiniBand Devices
In order to address the high performance I/O needs of HPC and enterprise applications, modern interconnection fabrics, such as InfiniBand and more recently, 10GigE, rely on network...
Adit Ranadive, Ada Gavrilovska, Karsten Schwan
ICPP
2008
IEEE
15 years 4 months ago
Optimizing Issue Queue Reliability to Soft Errors on Simultaneous Multithreaded Architectures
The issue queue (IQ) is a key microarchitecture structure for exploiting instruction-level and thread-level parallelism in dynamically scheduled simultaneous multithreaded (SMT) p...
Xin Fu, Wangyuan Zhang, Tao Li, José A. B. ...
HPCA
2007
IEEE
15 years 10 months ago
Improving Branch Prediction and Predicated Execution in Out-of-Order Processors
If-conversion is a compiler technique that reduces the misprediction penalties caused by hard-to-predict branches, transforming control dependencies into data dependencies. Althou...
Eduardo Quiñones, Joan-Manuel Parcerisa, An...