Sciweavers

115 search results - page 5 / 23
» Fusion of Loops for Parallelism and Locality
ICPP
1999
IEEE
Access Descriptor Based Locality Analysis for Distributed-Shared Memory Multiprocessors
Most of today's multiprocessors have a Distributed-Shared Memory (DSM) organization, which enables scalability while retaining the convenience of the shared-memory programming...
Angeles G. Navarro, Rafael Asenjo, Emilio L. Zapat...
CC
2008
Springer
Automatic Transformations for Communication-Minimized Parallelization and Locality Optimization in the Polyhedral Model
The polyhedral model provides powerful abstractions to optimize loop nests with regular accesses. Affine transformations in this model capture a complex sequence of execution-reord...
Uday Bondhugula, Muthu Manikandan Baskaran, Sriram...
EUROPAR
2011
Springer
Model-Driven Tile Size Selection for DOACROSS Loops on GPUs
DOALL loops are tiled to exploit DOALL parallelism and data locality on GPUs. In contrast, due to loop-carried dependences, DOACROSS loops must be skewed first in order to make ti...
Peng Di, Jingling Xue
IPPS
2007
IEEE
Optimizing Inter-Nest Data Locality Using Loop Splitting and Reordering
With the increasing gap between processor speed and memory latency, the performance of data-dominated programs is becoming more reliant on fast data access, which can be improved...
Sofiane Naci
JISE
2002
Locality-Preserving Dynamic Load Balancing for Data-Parallel Applications on Distributed-Memory Multiprocessors
Load balancing and data locality are the two most important factors in the performance of parallel programs on distributed-memory multiprocessors. A good balancing scheme should e...
Pangfeng Liu, Jan-Jan Wu, Chih-Hsuae Yang