Sciweavers

583 search results - page 93 / 117
» NAS Parallel Benchmark Results
Sort
View
HPCA
2007
IEEE
16 years 5 days ago
A Burst Scheduling Access Reordering Mechanism
Utilizing the nonuniform latencies of SDRAM devices, access reordering mechanisms alter the sequence of main memory access streams to reduce the observed access latency. Using a r...
Jun Shao, Brian T. Davis
HPCA
2004
IEEE
16 years 5 days ago
Reducing Branch Misprediction Penalty via Selective Branch Recovery
Branch misprediction penalty consists of two components: the time wasted on mis-speculative execution until the mispredicted branch is resolved and the time to restart the pipelin...
Amit Gandhi, Haitham Akkary, Srikanth T. Srinivasa...
ICCD
2006
IEEE
115views Hardware» more  ICCD 2006»
15 years 8 months ago
Microarchitecture and Performance Analysis of Godson-2 SMT Processor
—This paper introduces the microarchitecture and logical implementation of SMT (Simultaneous Multithreading) improvement of Godson-2 processor which is a 64-bit, four-issue, out-...
Zusong Li, Xianchao Xu, Weiwu Hu, Zhimin Tang
ICCD
2004
IEEE
125views Hardware» more  ICCD 2004»
15 years 8 months ago
IPC Driven Dynamic Associative Cache Architecture for Low Energy
Existing schemes for cache energy optimization incorporate a limited degree of dynamic associativity: either direct mapped or full available associativity (say 4-way). In this pap...
Sriram Nadathur, Akhilesh Tyagi
ICS
2009
Tsinghua U.
15 years 6 months ago
Parametric multi-level tiling of imperfectly nested loops
Tiling is a crucial loop transformation for generating high performance code on modern architectures. Efficient generation of multilevel tiled code is essential for maximizing da...
Albert Hartono, Muthu Manikandan Baskaran, C&eacut...