This paper proposes and evaluates single-ISA heterogeneous multi-core architectures as a mechanism to reduce processor power dissipation. Our design incorporates heterogeneous cor...
Rakesh Kumar, Keith I. Farkas, Norman P. Jouppi, P...
Many dynamic optimization and/or binary translation systems hold optimized/translated superblocks in a code cache. Conventional code caching systems suffer from overheads when con...
Ensuring back-to-back execution of dependent instructions in a conventional out-of-order processor requires scheduling logic that wakes up and selects instructions at the same rat...
Abstract—High-speed routers often use commodity, fully-associative, TCAMs (Ternary Content Addressable Memories) to perform packet classification and routing (IP lookup). We prop...
A dynamic optimizer is a runtime software system that groups a program’s instruction sequences into traces, optimizes those traces, stores the optimized traces in a softwarebase...
The impact of pipeline length on both the power and performance of a microprocessor is explored both theoretically and by simulation. A theory is presented for a wide range of pow...
Microarchitectural prediction based on neural learning has received increasing attention in recent years. However, neural prediction remains impractical because its superior accur...
Wire delays are a major concern for current and forthcoming processors. One approach to attack this problem is to divide the processor into semi-independent units referred to as c...
With power dissipation becoming an increasingly vexing problem across many classes of computer systems, measuring power dissipation of real, running systems has become crucial for...
In this paper we address the design of a future high-speed router that supports line rates as high as OC-3072 (160 Gb/s), around one hundred ports and several service classes. Bui...