Energy saving is becoming one of the major design issues in processor architectures with multiple functional units (FUs). Nested loops are usually the most critical part in multim...
Most hardware compilers apply loop pipelining to increase the parallelism achieved, but pipelining is restricted to the only innermost level in a nested loop. In this work we exten...
Kieron Turkington, Turkington A. Constantinides, K...
This paper deals with general nested loops and proposes a novel scheduling methodology for reducing the communication cost of parallel programs. General loops contain complex loop...
Florina M. Ciorba, Theodore Andronikos, Ioannis Dr...
The transition to multithreaded, multi-core designs places a greater responsibility on programmers and software for improving performance; thread-level parallelism (TLP) will be i...
Tipp Moseley, Daniel A. Connors, Dirk Grunwald, Ra...
Both inherently sequential code and limitations of analysis techniques prevent full parallelization of many applications by parallelizing compilers. Amdahl's Law tells us tha...