Exploitation of large amounts of instruction level parallelism requires a large amount of connectivity between the shared register file and the function units; this connectivity i...
Modulo scheduling is an e cient technique for exploiting instruction level parallelism in a variety of loops, resulting in high performance code but increased register requirement...
Alexandre E. Eichenberger, Edward S. Davidson, San...
An emerging trend in processor design is the addition of short vector instructions to general-purpose and embedded ISAs. Frequently, these extensions are employed using traditiona...
Samuel Larsen, Rodric M. Rabbah, Saman P. Amarasin...
Recently, the practice of speculation in resolving data dependences has been studied as a means of extracting more instruction level parallelism (ILP). An outcome of an instructio...
Automatic optimization of address offset assignment for DSP applications, which reduces the number of address arithmetic instructions to meet the tight memory size restrictions an...