Sciweavers

ICPP
1999
IEEE
13 years 9 months ago
Trace-Level Reuse
Trace-level reuse is based on the observation that some traces (dynamic sequences of instructions) are frequently repeated during the execution of a program, and in many cases, th...
Antonio González, Jordi Tubella, Carlos Mol...
AES
2000
Springer
107views Cryptology» more  AES 2000»
13 years 9 months ago
Speeding up Serpent
We present a method for finding efficient instruction sequences for the Serpent S-boxes. Current implementations need many registers to store temporary variables, yet the common ...
Dag Arne Osvik
ISPAN
2000
IEEE
13 years 9 months ago
Versatile Processor Design for Efficiency and High Performance
We present new architectural concepts for uniprocessor designs that conform to the data-driven computation paradigm. Usage of our D2 -CPU (Data-Driven processor) follows the natura...
Sotirios G. Ziavras
ICPP
2000
IEEE
13 years 9 months ago
Partial Resolution in Data Value Predictors
Recently, the practice of speculation in resolving data dependences has been studied as a means of extracting more instruction level parallelism (ILP). An outcome of an instructio...
Toshinori Sato, Itsujiro Arita
CASES
2003
ACM
13 years 10 months ago
Reducing code size with echo instructions
In an embedded system, the cost of storing a program onchip can be as high as the cost of a microprocessor. Compressing an application’s code to reduce the amount of memory requ...
Jeremy Lau, Stefan Schoenmackers, Timothy Sherwood...
MICRO
2003
IEEE
101views Hardware» more  MICRO 2003»
13 years 10 months ago
Macro-op Scheduling: Relaxing Scheduling Loop Constraints
Ensuring back-to-back execution of dependent instructions in a conventional out-of-order processor requires scheduling logic that wakes up and selects instructions at the same rat...
Ilhyun Kim, Mikko H. Lipasti
ICS
2004
Tsinghua U.
13 years 10 months ago
Scaling the issue window with look-ahead latency prediction
In contemporary out-of-order superscalar design, high IPC is mainly achieved by exposing high instruction level parallelism (ILP). Scaling issue window size can certainly provide ...
Yongxiang Liu, Anahita Shayesteh, Gokhan Memik, Gl...
ASPDAC
2004
ACM
75views Hardware» more  ASPDAC 2004»
13 years 10 months ago
Power-performance trade-off using pipeline delays
— We study the delays faced by instructions in the pipeline of a superscalar processor and its impact on power and performance. Instructions that are ready-on-dispatch (ROD) are ...
G. Surendra, Subhasis Banerjee, S. K. Nandy
EUROPAR
2005
Springer
13 years 10 months ago
Non-uniform Instruction Scheduling
Dynamic instruction scheduling logic is one of the most critical and cycle-limiting structures in modern superscalar processors, and it is not easily pipelined without significant ...
Joseph J. Sharkey, Dmitry V. Ponomarev
ASPLOS
2006
ACM
13 years 10 months ago
Understanding prediction-based partial redundant threading for low-overhead, high- coverage fault tolerance
Redundant threading architectures duplicate all instructions to detect and possibly recover from transient faults. Several lighter weight Partial Redundant Threading (PRT) archite...
Vimal K. Reddy, Eric Rotenberg, Sailashri Parthasa...