Sciweavers

ISCA
1999
IEEE
96views Hardware» more  ISCA 1999»
13 years 8 months ago
PipeRench: A Coprocessor for Streaming multimedia Acceleration
Future computing workloads will emphasize an architecture's ability to perform relatively simple calculations on massive quantities of mixed-width data. This paper describes ...
Seth Copen Goldstein, Herman Schmit, Matthew Moe, ...
ISCA
1999
IEEE
104views Hardware» more  ISCA 1999»
13 years 8 months ago
Is SC + ILP=RC?
Sequential consistency (SC) is the simplest programming interface for shared-memory systems but imposes program order among all memory operations, possibly precluding high perform...
Chris Gniady, Babak Falsafi, T. N. Vijaykumar
ISCA
1999
IEEE
89views Hardware» more  ISCA 1999»
13 years 8 months ago
Simultaneous Subordinate Microthreading (SSMT)
Current work in Simultaneous Multithreading provides little benefit to programs that aren't partitioned into threads. We propose Simultaneous Subordinate Microthreading (SSMT...
Robert S. Chappell, Jared Stark, Sangwook P. Kim, ...
ISCA
1999
IEEE
90views Hardware» more  ISCA 1999»
13 years 8 months ago
Selective Value Prediction
Value Prediction is a relatively new technique to increase instruction-level parallelism by breaking true data dependence chains. A value prediction architecture produces values, ...
Brad Calder, Glenn Reinman, Dean M. Tullsen
ISCA
1999
IEEE
94views Hardware» more  ISCA 1999»
13 years 8 months ago
A Performance Comparison of Contemporary DRAM Architectures
In response to the growing gap between memory access time and processor speed, DRAM manufacturers have created several new DRAM architectures. This paper presents a simulation-bas...
Vinodh Cuppu, Bruce L. Jacob, Brian Davis, Trevor ...
ISCA
1999
IEEE
110views Hardware» more  ISCA 1999»
13 years 8 months ago
Decoupling Local Variable Accesses in a Wide-Issue Superscalar Processor
Providing adequate data bandwidth is extremely important for a wide-issue superscalar processor to achieve its full performance potential. Adding a large number of ports to a data...
Sangyeun Cho, Pen-Chung Yew, Gyungho Lee
ISCA
1999
IEEE
124views Hardware» more  ISCA 1999»
13 years 8 months ago
The Block-Based Trace Cache
The trace cache is a recently proposed solution to achieving high instruction fetch bandwidth by buffering and reusing dynamic instruction traces. This work presents a new block-b...
Bryan Black, Bohuslav Rychlik, John Paul Shen
ISCA
1999
IEEE
98views Hardware» more  ISCA 1999»
13 years 8 months ago
Multicast Snooping: A New Coherence Method Using a Multicast Address Network
This paper proposes a new coherence method called "multicast snooping" that dynamically adapts between broadcast snooping and a directory protocol. Multicast snooping is...
E. Ender Bilir, Ross M. Dickson, Ying Hu, Manoj Pl...
ISCA
1999
IEEE
105views Hardware» more  ISCA 1999»
13 years 8 months ago
The Program Decision Logic Approach to Predicated Execution
Modern compilers must expose sufficient amounts of Instruction-Level Parallelism (ILP) to achieve the promised performance increases of superscalar and VLIW processors. One of the...
David I. August, John W. Sias, Jean-Michel Puiatti...
ISCA
1999
IEEE
90views Hardware» more  ISCA 1999»
13 years 8 months ago
Correlated Load-Address Predictors
As microprocessors become faster, the relative performance cost of memory accesses increases. Bigger and faster caches significantly reduce the absolute load-to-use time delay. Ho...
Michael Bekerman, Stéphan Jourdan, Ronny Ro...