We present a case study parallelizing streaming aggregation on three different parallel hardware architectures. Aggregation is a performance-critical operation for data summarizat...
Scott Schneider, Henrique Andrade, Bugra Gedik, Ku...
Abstract—GPUs have recently been used to accelerate dataparallel applications for they provide easier programmability and increased generality while maintaining the tremendous me...
Ping Yao, Hong An, Mu Xu, Gu Liu, Xiaoqiang Li, Ya...
As the number of cores and threads in manycore compute accelerators such as Graphics Processing Units (GPU) increases, so does the importance of on-chip interconnection network des...
GPU-based heterogeneous clusters continue to draw attention from vendors and HPC users due to their high energy efficiency and much improved single-node computational performance...
Graphics processing units (GPUs) provide both memory bandwidth and arithmetic performance far greater than that available on CPUs but, because of their Single-Instruction-Multiple...