
PARA 2004, Springer

Cache Optimizations for Iterative Numerical Codes Aware of Hardware Prefetching

Cache optimizations typically include code transformations to increase the locality of memory accesses. An orthogonal approach is to enable latency hiding by introducing prefetching techniques. With software prefetching, cache load instructions have to be inserted into the program code. To spare the programmer this complexity, modern processors are equipped with hardware prefetching units which predict future memory accesses in order to load data into the cache automatically before it is used. For optimal performance, it seems advantageous to combine both prefetching approaches. In this contribution, we first use a cache simulation enhanced with a simple hardware prefetcher to run code for a 3D multigrid solver. Cache misses which are not predicted by the prefetcher can be located in the simulation results, and software prefetch instructions can be inserted selectively. However, when the performance of a code section is limited by the available bandwidth to main memory, this simple strategy ...
Josef Weidendorfer, Carsten Trinitis
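
As a rough illustration of the kind of selective software prefetch insertion the abstract describes (not code from the paper), the sketch below adds a GCC __builtin_prefetch hint to a 3D Jacobi-style smoother loop. The grid size N, the IDX indexing macro, the prefetch distance, and the locality hint are all illustrative assumptions; in the approach described above, the placement and distance would instead be derived from the cache-simulation results.

#include <stddef.h>

#define N 256
#define IDX(i, j, k) (((size_t)(i) * N + (size_t)(j)) * N + (size_t)(k))

/* u and v are assumed to be arrays of N*N*N doubles. */
void smooth(const double *restrict u, double *restrict v)
{
    for (int i = 1; i < N - 1; ++i)
        for (int j = 1; j < N - 1; ++j)
            for (int k = 1; k < N - 1; ++k) {
                /* Hint at the strided u(i+1, j, .) plane, which a simple
                 * sequential hardware prefetcher is likely to miss; the
                 * unit-stride k-direction is left to the hardware.
                 * The distance of 8 elements is an illustrative guess. */
                __builtin_prefetch(&u[IDX(i + 1, j, k + 8)], 0, 1);

                /* 7-point Jacobi update */
                v[IDX(i, j, k)] = (1.0 / 6.0) *
                    (u[IDX(i - 1, j, k)] + u[IDX(i + 1, j, k)] +
                     u[IDX(i, j - 1, k)] + u[IDX(i, j + 1, k)] +
                     u[IDX(i, j, k - 1)] + u[IDX(i, j, k + 1)]);
            }
}

Whether such a manual prefetch actually helps depends on what the hardware prefetcher already covers and on whether the loop is bandwidth-bound, which is exactly the limitation the abstract points out.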
Type Conference
Year 2004
Where PARA
Authors Josef Weidendorfer, Carsten Trinitis