Sciweavers

PPSC
1997

Improving Memory-System Performance of Sparse Matrix-Vector Multiplication

Sparse matrix-vector multiplication is an important kernel that often runs inefficiently on superscalar RISC processors. This paper describes techniques that increase instruction-level parallelism and improve performance: reordering to reduce cache misses (a technique originally due to Das et al.), blocking to reduce the number of load instructions, and prefetching to prevent multiple load-store units from stalling simultaneously. The techniques improve performance from about 40 Mflops (on a well-ordered matrix) to over 100 Mflops on a 266 Mflops machine.
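
As a rough illustration of why blocking reduces load instructions, the sketch below compares a plain compressed sparse row (CSR) product with a variant that stores the matrix as dense 2x2 blocks. It is not taken from the paper: the function names, the 2x2 block size, and the storage layout are assumptions chosen for illustration, and Python is used only for readability (it shows the access pattern, not the performance).

```python
def spmv_csr(row_ptr, col_idx, vals, x):
    """Compute y = A x for A in compressed sparse row (CSR) form."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Per nonzero: one column-index load, one value load, one x load.
            acc += vals[k] * x[col_idx[k]]
        y[i] = acc
    return y


def spmv_bcsr_2x2(brow_ptr, bcol_idx, bvals, x):
    """Compute y = A x for A stored as dense 2x2 blocks (hypothetical layout).

    bvals holds each block in row-major order: [a00, a01, a10, a11].
    bcol_idx holds one block-column index per block.
    """
    nbrows = len(brow_ptr) - 1
    y = [0.0] * (2 * nbrows)
    for bi in range(nbrows):
        y0 = y1 = 0.0
        for k in range(brow_ptr[bi], brow_ptr[bi + 1]):
            j = 2 * bcol_idx[k]          # one index load covers four values
            x0, x1 = x[j], x[j + 1]      # each x entry is loaded once, used twice
            a00, a01, a10, a11 = bvals[4 * k:4 * k + 4]
            y0 += a00 * x0 + a01 * x1
            y1 += a10 * x0 + a11 * x1
        y[2 * bi] = y0
        y[2 * bi + 1] = y1
    return y


# The same 4x4 matrix in both formats:
# [[1, 2, 0, 0],
#  [3, 4, 0, 0],
#  [0, 0, 5, 6],
#  [7, 0, 0, 8]]
row_ptr = [0, 2, 4, 6, 8]
col_idx = [0, 1, 0, 1, 2, 3, 0, 3]
vals = [1, 2, 3, 4, 5, 6, 7, 8]

brow_ptr = [0, 1, 3]
bcol_idx = [0, 0, 1]
bvals = [1, 2, 3, 4,   0, 0, 7, 0,   5, 6, 0, 8]  # explicit zeros pad the blocks

x = [1.0, 2.0, 3.0, 4.0]
print(spmv_csr(row_ptr, col_idx, vals, x))
print(spmv_bcsr_2x2(brow_ptr, bcol_idx, bvals, x))
```

In the blocked variant, one column index and two entries of x serve four multiply-adds, at the cost of storing and multiplying explicit zeros wherever a block is not completely full.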
Sivan Toledo
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 1997
Where PPSC
Authors Sivan Toledo