Sciweavers

421 search results - page 16 / 85
» An Intelligent Parallel Loop Scheduling for Parallelizing Co...
Sort
View
ICS
1995
Tsinghua U.
15 years 1 months ago
Optimum Modulo Schedules for Minimum Register Requirements
Modulo scheduling is an e cient technique for exploiting instruction level parallelism in a variety of loops, resulting in high performance code but increased register requirement...
Alexandre E. Eichenberger, Edward S. Davidson, San...
IPPS
1999
IEEE
15 years 1 months ago
Cascaded Execution: Speeding Up Unparallelized Execution on Shared-Memory Multiprocessors
Both inherently sequential code and limitations of analysis techniques prevent full parallelization of many applications by parallelizing compilers. Amdahl's Law tells us tha...
Ruth E. Anderson, Thu D. Nguyen, John Zahorjan
MICRO
1995
IEEE
217views Hardware» more  MICRO 1995»
15 years 1 months ago
Improving instruction-level parallelism by loop unrolling and dynamic memory disambiguation
Exploitation ofinstruction-levelparallelism is an ejfective mechanism for improving the performance of modern super-scalar/VLIW processors. Various software techniques can be appl...
Jack W. Davidson, Sanjay Jinturkar
ICS
2000
Tsinghua U.
15 years 1 months ago
Fast greedy weighted fusion
Loop fusion is important to optimizing compilers because it is an important tool in managing the memory hierarchy. By fusing loops that use the same data elements, we can reduce t...
Ken Kennedy
PLDI
2000
ACM
15 years 1 months ago
Exploiting superword level parallelism with multimedia instruction sets
Increasing focus on multimedia applications has prompted the addition of multimedia extensions to most existing general purpose microprocessors. This added functionality comes pri...
Samuel Larsen, Saman P. Amarasinghe