If parallelism can be successfully exploited in a program, significant reductions in execution time can be achieved. However, if sections of the code are dominated by parallel ove...
We propose a tunable scheduling strategy that lies between FIFO and shortest- rst, based on the value of a coe cient Alpha. If Alpha is set to zero then this strategy is just FIFO....
Processors with write-through caches typically require a write buffer to hide the write latency to the next level of memory hierarchy and to reduce write traffic. A write buffer ...
We propose a method for compressing programs in embedded processors where instruction memory size dominates cost. A post-compilation analyzer examines a program and replaces commo...
Charles Lefurgy, Peter L. Bird, I-Cheng K. Chen, T...
As the issue widthof superscalar processors is increased, instructionfetch bandwidthrequirements will also increase. It will become necessary to fetch multiple basic blocks per cy...