We describe a radically new cache architecture and demonstrate that it offers a huge reduction in cache cost, size and power consumption whilst maintaining performance on a wide ra...
The compiler is generally regarded as the most important software component that supports a processor design to achieve success. This paper describes our application of the open re...
Modern microprocessors can achieve high performance on linear algebra kernels but this currently requires extensive machine-speci c hand tuning. We have developed a methodology wh...
Jeff Bilmes, Krste Asanovic, Chee-Whye Chin, James...
In the last decade, instruction-set simulators have become an essential development tool for the design of new programmable architectures. Consequently, the simulator performance ...
Achim Nohl, Gunnar Braun, Oliver Schliebusch, Rain...
—Thread-level parallelism (TLP) has been extensively studied in order to overcome the limitations of exploiting instruction-level parallelism (ILP) on high-performance superscala...