In this paper an optimized k-means implementation on the graphics processing unit (GPU) is presented. NVIDIA’s Compute Unified Device Architecture (CUDA), available from the G8...
In this paper we present a GPU-based multigrid approach for simulating elastic deformable objects in real time. Our method is based on a finite element discretization of the defo...
Although stencil auto-tuning has shown tremendous potential in effectively utilizing architectural resources, it has hitherto been limited to single kernel instantiations; in addi...
Shoaib Kamil, Cy Chan, Leonid Oliker, John Shalf, ...
Graphics processors (GPUs) provide a vast number of simple, data-parallel, deeply multithreaded cores and high memory bandwidths. GPU architectures are becoming increasingly progr...
Shuai Che, Michael Boyer, Jiayuan Meng, David Tarj...
In this paper a novel implementation of the saliency map model on a multi-GPU platform using CUDA technology is presented. The saliency map model is a well-known computational m...