Sciweavers

16159 search results - page 15 / 3232
» Parallel computing with CUDA
Sort
View
64
Voted
EGH
2009
Springer
14 years 7 months ago
Efficient stream compaction on wide SIMD many-core architectures
Stream compaction is a common parallel primitive used to remove unwanted elements in sparse data. This allows highly parallel algorithms to maintain performance over several proce...
Markus Billeter, Ola Olsson, Ulf Assarsson
106
Voted
PPOPP
2009
ACM
15 years 10 months ago
OpenMP to GPGPU: a compiler framework for automatic translation and optimization
GPGPUs have recently emerged as powerful vehicles for generalpurpose high-performance computing. Although a new Compute Unified Device Architecture (CUDA) programming model from N...
Seyong Lee, Seung-Jai Min, Rudolf Eigenmann
79
Voted
CEC
2010
IEEE
14 years 10 months ago
Evolving a CUDA kernel from an nVidia template
Rather than attempting to evolve a complete program from scratch we demonstrate genetic interface programming (GIP) by automatically generating a parallel CUDA kernel with identica...
William B. Langdon, Mark Harman
CC
2012
Springer
250views System Software» more  CC 2012»
13 years 5 months ago
Improving Performance of OpenCL on CPUs
Abstract. Data-parallel languages like OpenCL and CUDA are an important means to exploit the computational power of today’s computing devices. In this paper, we deal with two asp...
Ralf Karrenberg, Sebastian Hack
DEBS
2010
ACM
15 years 1 months ago
Evaluation of streaming aggregation on parallel hardware architectures
We present a case study parallelizing streaming aggregation on three different parallel hardware architectures. Aggregation is a performance-critical operation for data summarizat...
Scott Schneider, Henrique Andrade, Bugra Gedik, Ku...