Sciweavers

70 search results - page 1 / 14
MICRO
1997
IEEE
108 views, Hardware
13 years 9 months ago
Improving the Accuracy and Performance of Memory Communication Through Renaming
As processors continue to exploit more instruction level parallelism, a greater demand is placed on reducing the effects of memory access latency. In this paper, we introduce a nov...
Gary S. Tyson, Todd M. Austin
ISCA
2003
IEEE
101 views, Hardware
13 years 10 months ago
Overcoming the Limitations of Conventional Vector Processors
Despite their superior performance for multimedia applications, vector processors have three limitations that hinder their widespread acceptance. First, the complexity and size of...
Christoforos E. Kozyrakis, David A. Patterson
JILP
2000
79 views
13 years 4 months ago
A Comparative Survey of Load Speculation Architectures
Load latency remains a significant bottleneck in dynamically scheduled pipelined processors. Load speculation techniques have been proposed to reduce this latency. Dependence Predi...
Brad Calder, Glenn Reinman
EUROPAR
2001
Springer
13 years 9 months ago
Performance of High-Accuracy PDE Solvers on a Self-Optimizing NUMA Architecture
High-accuracy PDE solvers use multi-dimensional fast Fourier transforms. The FFTs exhibit a static and structured memory access pattern which results in a large amount of communic...
Sverker Holmgren, Dan Wallin
SC
2004
ACM
13 years 10 months ago
The Potential of Computation Regrouping for Improving Locality
Improving program locality has become increasingly important on modern computer systems. An effective strategy is to group computations on the same data so that once the data are ...
Chen Ding, Maksim Orlovich