Techniques for aggressive optimization and parallelization of applications can have the side-effect of introducing copy instructions, register-to-register move instructions, into t...
The focus of this paper is on VLIW instruction scheduling that minimizes the variation of power consumed by the processor during the execution of a target program. We use rough set...
Advanced bit manipulation operations are not efficiently supported by commodity word-oriented microprocessors. Programming tricks are typically devised to shorten the long sequence...
Graphics and media processing is quickly emerging to become one of the key computing workloads. Programmable graphics processors give designers extra flexibility by running a sma...
Mauricio Breternitz Jr., Herbert H. J. Hum, Sanjee...
An increasingly large portion of scheduler latency is derived from the monolithic content addressable memory (CAM) arrays accessed during instruction wakeup. The performance of th...