Sciweavers

MICRO
2002
IEEE
97views Hardware» more  MICRO 2002»
13 years 9 months ago
Instruction fetch deferral using static slack
In this paper we present an approach to boosting performance and tolerating latency by deferring non-critical instructions into a deferred queue for later processing. As such, ins...
Gregory A. Muthler, David Crowe, Sanjay J. Patel, ...
ISCA
2002
IEEE
93views Hardware» more  ISCA 2002»
13 years 9 months ago
Transient-Fault Recovery Using Simultaneous Multithreading
We propose a scheme for transient-fault recovery called Simultaneously and Redundantly Threaded processors with Recovery (SRTR) that enhances a previously proposed scheme for tran...
T. N. Vijaykumar, Irith Pomeranz, Karl Cheng
ISCA
2002
IEEE
103views Hardware» more  ISCA 2002»
13 years 9 months ago
Efficient Dynamic Scheduling Through Tag Elimination
An increasingly large portion of scheduler latency is derived from the monolithic content addressable memory (CAM) arrays accessed during instruction wakeup. The performance of th...
Dan Ernst, Todd M. Austin
DSD
2002
IEEE
90views Hardware» more  DSD 2002»
13 years 9 months ago
Simplifying Instruction Issue Logic in Superscalar Processors
Modern microprocessors schedule instructions dynamically in order to exploit instruction-level parallelism. It is necessary to increase instruction window size for improving instr...
Toshinori Sato, Itsujiro Arita
DAC
2003
ACM
13 years 9 months ago
Instruction set compiled simulation: a technique for fast and flexible instruction set simulation
Instruction set simulators are critical tools for the exploration and validation of new programmable architectures. Due to increasing complexity of the architectures and timeto-ma...
Mehrdad Reshadi, Prabhat Mishra, Nikil D. Dutt
ISCA
2003
IEEE
108views Hardware» more  ISCA 2003»
13 years 9 months ago
Effective ahead Pipelining of Instruction Block Address Generation
On a N-way issue superscalar processor, the front end instruction fetch engine must deliver instructions to the execution core at a sustained rate higher than N instructions per c...
André Seznec, Antony Fraboulet
ISCA
2003
IEEE
150views Hardware» more  ISCA 2003»
13 years 9 months ago
Cyclone: A Broadcast-Free Dynamic Instruction Scheduler with Selective Replay
To achieve high instruction throughput, instruction schedulers must be capable of producing high-quality schedules that maximize functional unit utilization while at the same time...
Dan Ernst, Andrew Hamel, Todd M. Austin
DATE
2003
IEEE
102views Hardware» more  DATE 2003»
13 years 9 months ago
G-MAC: An Application-Specific MAC/Co-Processor Synthesizer
: A modern special-purpose processor (e.g., for image and graphical applications) usually contains a set of instructions supporting complex multiply-operations. These instructions ...
Alex C.-Y. Chang, Wu-An Kuo, Allen C.-H. Wu, TingT...
IPPS
2005
IEEE
13 years 10 months ago
Control-Flow Independence Reuse via Dynamic Vectorization
Current processors exploit out-of-order execution and branch prediction to improve instruction level parallelism. When a branch prediction is wrong, processors flush the pipeline ...
Alex Pajuelo, Antonio González, Mateo Valer...
IEEEPACT
2005
IEEE
13 years 10 months ago
A Distributed Control Path Architecture for VLIW Processors
VLIW architectures are popular in embedded systems because they offer high-performance processing at low cost and energy. The major problem with traditional VLIW designs is that t...
Hongtao Zhong, Kevin Fan, Scott A. Mahlke, Michael...