SODA
2008
ACM

Provably good multicore cache performance for divide-and-conquer algorithms

This paper presents a multicore-cache model that reflects the reality that multicore processors have both per-processor private (L1) caches and a large shared (L2) cache on chip. We consider a broad class of parallel divide-and-conquer algorithms and present a new on-line scheduler, controlled-pdf, that is competitive with the standard sequential scheduler in the following sense. Given any dynamically unfolding computation DAG from this class of algorithms, the cache complexity on the multicore-cache model under our new scheduler is within a constant factor of the sequential cache complexity for both L1 and L2, while the time complexity is within a constant factor of the sequential time complexity divided by the number of processors p. These are the first such asymptotically optimal results for any multicore model. Finally, we show that a separator-based algorithm for sparse-matrix-dense-vector-multiply achieves provably good cache performance in the multicore-cache model, as well as in ...
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2008
Where SODA
Authors Guy E. Blelloch, Rezaul Alam Chowdhury, Phillip B. Gibbons, Vijaya Ramachandran, Shimin Chen, Michael Kozuch