Trace-level reuse is based on the observation that some traces (dynamic sequences of instructions) are frequently repeated during the execution of a program, and in many cases, th...
We present a method for finding efficient instruction sequences for the Serpent S-boxes. Current implementations need many registers to store temporary variables, yet the common ...
We present new architectural concepts for uniprocessor designs that conform to the data-driven computation paradigm. Usage of our D2 -CPU (Data-Driven processor) follows the natura...
Recently, the practice of speculation in resolving data dependences has been studied as a means of extracting more instruction level parallelism (ILP). An outcome of an instructio...
In an embedded system, the cost of storing a program onchip can be as high as the cost of a microprocessor. Compressing an application’s code to reduce the amount of memory requ...
Jeremy Lau, Stefan Schoenmackers, Timothy Sherwood...
Ensuring back-to-back execution of dependent instructions in a conventional out-of-order processor requires scheduling logic that wakes up and selects instructions at the same rat...
In contemporary out-of-order superscalar design, high IPC is mainly achieved by exposing high instruction level parallelism (ILP). Scaling issue window size can certainly provide ...
— We study the delays faced by instructions in the pipeline of a superscalar processor and its impact on power and performance. Instructions that are ready-on-dispatch (ROD) are ...
Dynamic instruction scheduling logic is one of the most critical and cycle-limiting structures in modern superscalar processors, and it is not easily pipelined without significant ...
Redundant threading architectures duplicate all instructions to detect and possibly recover from transient faults. Several lighter weight Partial Redundant Threading (PRT) archite...
Vimal K. Reddy, Eric Rotenberg, Sailashri Parthasa...