Sciweavers

458 search results - page 3 / 92
» Performance study of mapping irregular computations on GPUs
ICS
2009
Tsinghua U.
14 years 1 month ago
Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs
Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture...
Jiayuan Meng, Kevin Skadron
IPPS
2010
IEEE
13 years 4 months ago
Dynamic load balancing on single- and multi-GPU systems
The computational power provided by many-core graphics processing units (GPUs) has been exploited in many applications. The programming techniques currently employed on these GPUs...
Long Chen, Oreste Villa, Sriram Krishnamoorthy, Gu...
HPCC
2011
Springer
12 years 6 months ago
Heuristic-Based Techniques for Mapping Irregular Communication Graphs to Mesh Topologies
Mapping of parallel applications on the network topology is becoming increasingly important on large supercomputers. Topology aware mapping can reduce the hops traveled by mess...
Abhinav Bhatele, Laxmikant V. Kalé
ICASSP
2011
IEEE
12 years 10 months ago
Real-time DVB-S2 LDPC decoding on many-core GPU accelerators
It is well known that LDPC decoding is computationally demanding and one of the hardest signal operations to parallelize. Beyond data dependencies that restrict the decoding of a ...
Gabriel Falcão Paiva Fernandes, Joao Andrad...
WWW
2009
ACM
14 years 7 months ago
Using graphics processors for high performance IR query processing
Web search engines are facing formidable performance challenges as they need to process thousands of queries per second over billions of documents. To deal with this heavy workloa...
Shuai Ding, Jinru He, Hao Yan, Torsten Suel