Sciweavers

» Fusion of Loops for Parallelism and Locality
ICS
2001
Tsinghua U.
15 years 2 months ago
Computer aided hand tuning (CAHT): "applying case-based reasoning to performance tuning"
For most parallel and high performance systems, tuning guides provide users with advice on optimizing the execution time of their programs. Execution time may be very sensitive...
Antoine Monsifrot, François Bodin
ICS
1999
Tsinghua U.
15 years 1 months ago
Nonlinear array layouts for hierarchical memory systems
Programming languages that provide multidimensional arrays and a flat linear model of memory must implement a mapping between these two domains to order array elements in memory....
Siddhartha Chatterjee, Vibhor V. Jain, Alvin R. Le...
IPPS
1998
IEEE
15 years 1 months ago
Compiler-Optimization of Implicit Reductions for Distributed Memory Multiprocessors
This paper presents reduction recognition and parallel code generation strategies for distributed-memory multiprocessors. We describe techniques to recognize a broad range of implic...
Bo Lu, John M. Mellor-Crummey
HPDC
1996
IEEE
15 years 1 months ago
Customized Dynamic Load Balancing for a Network of Workstations
Load balancing involves assigning to each processor work proportional to its performance, minimizing the execution time of the program. Although static load balancing can solve ma...
Mohammed Javeed Zaki, Wei Li, Srinivasan Parthasar...
IPPS
1995
IEEE
15 years 1 months ago
Index translation schemes for adaptive computations on distributed memory multicomputers
Current research in parallel programming is focused on closing the gap between globally indexed algorithms and the separate address spaces of processors on distributed memory mult...
Bongki Moon, Mustafa Uysal, Joel H. Saltz