: In this paper, we apply Particle-based Volume Rendering (PBVR) technique using a current programmable GPU architecture. Recently, the increasing programmability of GPU offers an ...
Naohisa Sakamoto, Jorji Nonaka, Koji Koyamada, Sat...
Many programs exploit shared-memory parallelism using multithreading. Threaded codes typically use locks to coordinate access to shared data. In many cases, contention for locks r...
Nathan R. Tallent, John M. Mellor-Crummey, Allan P...
Code restructuring compilers rely heavily on program analysis techniques to automatically detect data dependences between program statements. Dependences between statement instanc...
Johnnie Birch, Robert A. van Engelen, Kyle A. Gall...
The memory access limits the performance of stream processors. By exploiting the reuse of data held in the Stream Register File (SRF), an on-chip storage, the number of memory acc...
Xuejun Yang, Ying Zhang, Jingling Xue, Ian Rogers,...
Application performance tuning is a complex process that requires assembling various types of information and correlating it with source code to pinpoint the causes of performance...
John M. Mellor-Crummey, Robert J. Fowler, David B....