Graphics Processing Units (GPUs) have been growing in popularity due to their impressive processing capabilities, and with general purpose programming languages such as NVIDIA’s...
We design and implement Mars, a MapReduce framework, on graphics processors (GPUs). MapReduce is a distributed programming framework originally proposed by Google for the ease of ...
Bingsheng He, Wenbin Fang, Qiong Luo, Naga K. Govi...
The performance of irregular applications on modern computer systems is hurt by the wide gap between CPU and memory speeds because these applications typically underutilize multi-...
John M. Mellor-Crummey, David B. Whalley, Ken Kenn...
This paper describes a heterogeneous computer cluster called Axel. Axel contains a collection of nodes; each node can include multiple types of accelerators such as FPGAs (Field P...
We present a performance model-driven framework for automated performance tuning (autotuning) of sparse matrix-vector multiply (SpMV) on systems accelerated by graphics processing...