Sciweavers

MICRO
2005
IEEE
108views Hardware» more  MICRO 2005»
13 years 10 months ago
How to Fake 1000 Registers
Large numbers of logical registers can improve performance by allowing fast access to multiple subroutine contexts (register windows) and multiple thread contexts (multithreading)...
David W. Oehmke, Nathan L. Binkert, Trevor N. Mudg...
MICRO
2005
IEEE
130views Hardware» more  MICRO 2005»
13 years 10 months ago
Exploiting Vector Parallelism in Software Pipelined Loops
An emerging trend in processor design is the addition of short vector instructions to general-purpose and embedded ISAs. Frequently, these extensions are employed using traditiona...
Samuel Larsen, Rodric M. Rabbah, Saman P. Amarasin...
MICRO
2005
IEEE
139views Hardware» more  MICRO 2005»
13 years 10 months ago
Shader Performance Analysis on a Modern GPU Architecture
This paper presents an analysis of the performance of the shader processing units in a modern Graphics Processor Unit (GPU) architecture using real graphic applications. The archi...
Victor Moya Del Barrio, Carlos González, Jo...
MICRO
2005
IEEE
113views Hardware» more  MICRO 2005»
13 years 10 months ago
Thermal Management of On-Chip Caches Through Power Density Minimization
Various architectural power reduction techniques have been proposed for on-chip caches in the last decade. In this paper, we first show that these power reduction techniques can b...
Ja Chun Ku, Serkan Ozdemir, Gokhan Memik, Yehea I....
MICRO
2005
IEEE
133views Hardware» more  MICRO 2005»
13 years 10 months ago
Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution
Predicated execution has been used to reduce the number of branch mispredictions by eliminating hard-to-predict branches. However, the additional instruction overhead and addition...
Hyesoon Kim, Onur Mutlu, Jared Stark, Yale N. Patt
MICRO
2005
IEEE
125views Hardware» more  MICRO 2005»
13 years 10 months ago
Improving Region Selection in Dynamic Optimization Systems
The performance of a dynamic optimization system depends heavily on the code it selects to optimize. Many current systems follow the design of HP Dynamo and select a single interp...
David Hiniker, Kim M. Hazelwood, Michael D. Smith
MICRO
2005
IEEE
107views Hardware» more  MICRO 2005»
13 years 10 months ago
Stream Programming on General-Purpose Processors
— In this paper we investigate mapping stream programs (i.e., programs written in a streaming style for streaming architectures such as Imagine and Raw) onto a general-purpose CP...
Jayanth Gummaraju, Mendel Rosenblum
MICRO
2005
IEEE
126views Hardware» more  MICRO 2005»
13 years 10 months ago
Cost Sensitive Modulo Scheduling in a Loop Accelerator Synthesis System
Scheduling algorithms used in compilers traditionally focus on goals such as reducing schedule length and register pressure or producing compact code. In the context of a hardware...
Kevin Fan, Manjunath Kudlur, Hyunchul Park, Scott ...
MICRO
2005
IEEE
99views Hardware» more  MICRO 2005»
13 years 10 months ago
uComplexity: Estimating Processor Design Effort
Cyrus Bazeghi, Francisco J. Mesa-Martinez, Jose Re...