In this paper we present a novel processor microarchitecture that relieves three of the most important bottlenecks of superscalar processors: the serialization imposed by true dep...
Trace cache, an instruction fetch technique that reduces taken branch penalties by storing and fetching program instructions in dynamic execution order, dramatically improves inst...
Abstract. In SMT processors several threads run simultaneously to increase available ILP, sharing but competing for resources. The instruction fetch policy plays a key role, determ...
Instruction delivery is a critical component for wide-issue processors since its bandwidth and accuracy place an upper limit on performance. The processor front-end accuracy and ba...
A thread executing on a simultaneous multithreading (SMT) processor that experiences a long-latency load will eventually stall while holding execution resources. Existing long-lat...