Sciweavers

421 search results - page 6 / 85
» An Intelligent Parallel Loop Scheduling for Parallelizing Co...
ICPP
1999
IEEE
15 years 1 month ago
Access Descriptor Based Locality Analysis for Distributed-Shared Memory Multiprocessors
Most of today's multiprocessors have a Distributed-Shared Memory (DSM) organization, which enables scalability while retaining the convenience of the shared-memory programming...
Angeles G. Navarro, Rafael Asenjo, Emilio L. Zapat...
LCPC
2009
Springer
15 years 2 months ago
Unrolling Loops Containing Task Parallelism
Classic loop unrolling increases the performance of sequential loops by reducing the overheads of the non-computational parts of the loop. Unfortunately, when the loop con...
Roger Ferrer, Alejandro Duran, Xavier Martorell, E...
IPPS
1997
IEEE
15 years 1 month ago
A Compile-Time Partitioning Strategy for Non-Rectangular Loop Nests
This paper presents a compile-time scheme for partitioning non-rectangular loop nests which consist of inner loops whose bounds depend on the index of the outermost, parallel loop...
Rizos Sakellariou
IPPS
1997
IEEE
15 years 1 month ago
A BSP Approach to the Scheduling of Tightly-Nested Loops
This paper addresses the scheduling of uniform-dependence loop nests within the framework of the bulk-synchronous parallel (BSP) model. Two broad classes of tightly-nested loops are...
Radu Calinescu
PLDI
1993
ACM
15 years 1 month ago
Global Optimizations for Parallelism and Locality on Scalable Parallel Machines
Data locality is critical to achieving high performance on large-scale parallel machines. Non-local data accesses result in communication that can greatly impact performance. Thus ...
Jennifer-Ann M. Anderson, Monica S. Lam