GPGPUs have recently emerged as powerful vehicles for generalpurpose high-performance computing. Although a new Compute Unified Device Architecture (CUDA) programming model from N...
Parallel programming models based on a mixture of task and data parallelism have shown to be successful in addressing the increasing communication overhead of distributed memory p...
The Cray X1 was recently introduced as the first in a new line of parallel systems to combine high-bandwidth vector processing with an MPP system architecture. Alongside capabili...
Christian Bell, Wei-Yu Chen, Dan Bonachea, Katheri...
In this work, we propose a new FPGA design flow that combines the CUDA programming model from Nvidia with the state of the art high-level synthesis tool AutoPilot from AutoESL, to...
For more than thirty years, the parallel programming community has used the dependence graph as the main abstraction for reasoning about and exploiting parallelism in “regular...
Keshav Pingali, Donald Nguyen, Milind Kulkarni, Ma...