Trace caches enable high bandwidth, low latency instruction supply, but have a high miss penalty and relatively large working sets. Consequently, their performance may suffer due ...
Vector instruction sets are receiving renewed interest because of their applicability to multimedia. Current multimedia instruction sets use short vectors with SIMD implementation...
For many applications, branch mispredictions and cache misses limit a processor’s performance to a level well below its peak instruction throughput. A small fraction of static i...
The doubling of microprocessor performance every three years has been the result of two factors: more transistors per chip and superlinear scaling of the processor clock with tech...
Vikas Agarwal, M. S. Hrishikesh, Stephen W. Keckle...
Speculative parallelization aggressively executes in parallel codes that cannot be fully parallelized by the compiler. Past proposals of hardware schemes have mostly focused on si...