Sciweavers

315 search results - page 2 / 63
» On reducing load store latencies of cache accesses
Sort
View
MICRO
1995
IEEE
102views Hardware» more  MICRO 1995»
13 years 8 months ago
Zero-cycle loads: microarchitecture support for reducing load latency
Untolerated load instruction latencies often have a significant impact on overall program performance. As one means of mitigating this effect, we present an aggressive hardware-b...
Todd M. Austin, Gurindar S. Sohi
ICS
1998
Tsinghua U.
13 years 9 months ago
Load Execution Latency Reduction
In order to achieve high performance, contemporary microprocessors must effectively process the four major instruction types: ALU, branch, load, and store instructions. This paper...
Bryan Black, Brian Mueller, Stephanie Postal, Ryan...
FPL
2007
Springer
136views Hardware» more  FPL 2007»
13 years 11 months ago
A Load/Store Unit for a Memcpy Hardware Accelerator
Recently, a dedicated hardware accelerator was proposed that works in conjunction with caches found next to modern-day microprocessors, to speedup the commonly utilized memcpy ope...
Stamatis Vassiliadis, Filipa Duarte, Stephan Wong
MICRO
2005
IEEE
110views Hardware» more  MICRO 2005»
13 years 10 months ago
Scalable Store-Load Forwarding via Store Queue Index Prediction
Conventional processors use a fully-associative store queue (SQ) to implement store-load forwarding. Associative search latency does not scale well to capacities and bandwidths re...
Tingting Sha, Milo M. K. Martin, Amir Roth
ICCD
2006
IEEE
92views Hardware» more  ICCD 2006»
14 years 1 months ago
Fast Speculative Address Generation and Way Caching for Reducing L1 Data Cache Energy
— L1 data caches in high-performance processors continue to grow in set associativity. Higher associativity can significantly increase the cache energy consumption. Cache access...
Dan Nicolaescu, Babak Salamat, Alexander V. Veiden...