
ICS
2009
Tsinghua U.

Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs

Iterative stencil loops (ISLs) are used in many applications, and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture, there are usually halo regions that need to be updated and exchanged among different processing elements (PEs). In addition, synchronization is often used to signal the completion of halo exchanges. Both communication and synchronization may incur significant overhead on parallel architectures with shared memory. This is especially true in the case of graphics processors (GPUs), which do not preserve the state of the per-core L1 storage across global synchronizations. To reduce these overheads, ghost zones can be created to replicate stencil operations, reducing communication and synchronization costs at the expense of redundantly computing some values on multiple PEs. However, the selection of the optimal ghost zone size depends on the characteristics of both the architecture and the application, and it ...
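The ghost zone trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or performance model: it assumes a 1D three-point averaging stencil with fixed endpoints, runs sequentially on the CPU, and uses hypothetical names (`step`, `reference`, `tiled_with_ghost_zones`). Each tile loads `steps * radius` extra cells on each side, advances `steps` iterations with no halo exchange, and keeps only its own interior.

```python
def step(u):
    """One 3-point averaging step; the two endpoints are left unchanged."""
    v = list(u)
    for i in range(1, len(u) - 1):
        v[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return v


def reference(u, steps):
    """Untiled baseline: apply the stencil to the whole array `steps` times."""
    for _ in range(steps):
        u = step(u)
    return u


def tiled_with_ghost_zones(u, tile, steps, radius=1):
    """Advance `steps` iterations per tile without any halo exchange.

    Each tile is padded with a ghost zone of steps * radius cells per side
    (clamped at the array boundary). Stale values creep inward from the
    padded edges by `radius` cells per iteration, so after `steps` iterations
    only the ghost cells are invalid and the tile interior is still exact.
    """
    n = len(u)
    ghost = steps * radius
    out = [0.0] * n
    for start in range(0, n, tile):
        end = min(start + tile, n)
        lo = max(start - ghost, 0)   # left edge of the padded tile
        hi = min(end + ghost, n)     # right edge of the padded tile
        local = list(u[lo:hi])
        for _ in range(steps):       # redundant work happens in the ghost cells
            local = step(local)
        out[start:end] = local[start - lo:end - lo]
    return out
```

A larger `steps` value means fewer exchanges and synchronizations but a wider ghost zone and more redundant computation per tile, which is the balance the abstract says depends on both the architecture and the application.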
Jiayuan Meng, Kevin Skadron
Added 20 May 2010
Updated 20 May 2010
Type Conference
Year 2009
Where ICS
Authors Jiayuan Meng, Kevin Skadron