Sciweavers

6 search results - page 1 / 2
CORR
2010
Springer
13 years 4 months ago
Fast GPGPU Data Rearrangement Kernels using CUDA
Many high performance computing algorithms are bandwidth limited, hence the need for optimal data rearrangement kernels as well as their easy integration into the rest of the app...
Michael Bader, Hans-Joachim Bungartz, Dheevatsa Mu...
ASPLOS
2009
ACM
13 years 11 months ago
Performance analysis of accelerated image registration using GPGPU
This paper presents a performance analysis of an accelerated 2-D rigid image registration implementation that employs the Compute Unified Device Architecture (CUDA) programming e...
Peter Bui, Jay B. Brockman
CEC
2010
IEEE
13 years 5 months ago
Evolving a CUDA kernel from an nVidia template
Rather than attempting to evolve a complete program from scratch we demonstrate genetic interface programming (GIP) by automatically generating a parallel CUDA kernel with identica...
William B. Langdon, Mark Harman
PDP
2011
IEEE
12 years 8 months ago
Accelerating Parameter Sweep Applications Using CUDA
This paper proposes a parallelization scheme for parameter sweep (PS) applications using the compute unified device architecture (CUDA). Our scheme focuses on PS applications w...
Masaya Motokubota, Fumihiko Ino, Kenichi Hagihara
IPPS
2010
IEEE
13 years 2 months ago
Inter-block GPU communication via fast barrier synchronization
The graphics processing unit (GPU) has evolved from a fixed-function processor with programmable stages to a programmable processor with many fixed-function components that deliver...
Shucai Xiao, Wu-chun Feng