We study in this paper the problem of polyhedron scanning which appears for example when generating code for transformed loop nests in automatic parallelization. After a review of...
Hybrid branch predictors combine the predictions of multiple single-level or two-level branch predictors. The prediction-combining hardware -- the "meta-predictor" -may ...
Dirk Grunwald, Donald C. Lindsay, Benjamin G. Zorn
In order to achieve practical efficient execution on a parallel architecture, a knowledge of the data dependencies related to the application appears as the key point for building...
Determination of data dependences is a task typically performed with high-level language source code in today's optimizing and parallelizing compilers. Very little work has b...
Wolfram Amme, Peter Braun, Eberhard Zehendner, Fra...
Compile-time scheduling is one approach to extract parallelism which has proved effective when the execution behavior is predictable. Unfortunately, the performance of most priori...
The SGI Origin 2000 is designedto support a wide range of applications and has low local and remote memory latencies. However, it often has a high ratio of remote to local misses....
Global locality analysis is a technique for improving the cache performance of a sequence of loop nests through a combination of loop and data layout optimizations. Pure loop tran...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...