Implicit and explicit optimizations for stencil computations

Stencil-based kernels constitute the core of many scientific applications on block-structured grids. Unfortunately, these codes achieve a low fraction of peak performance, due primarily to the disparity between processor and main memory speeds. We examine several optimizations on the conventional cache-based memory systems of the Itanium 2, Opteron, and Power5, as well as on the heterogeneous multicore design of the Cell processor. The optimizations target cache reuse across stencil sweeps, including both an implicit cache-oblivious approach and a cache-aware algorithm blocked to match the cache structure. Finally, we consider stencil computations on a machine with an explicitly managed memory hierarchy, the Cell processor. Overall, results show that a cache-aware approach is significantly faster than a cache-oblivious approach and that the explicitly managed memory on Cell is more efficient: relative to the Power5, it has almost 2x more memory bandwidth and is 3.7x faster.
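As a rough illustration of what "cache reuse across stencil sweeps" can mean, here is a minimal Python sketch. It is not the algorithm from the paper: it swaps in a simple overlapped (ghost-zone) time-blocking scheme on a 1D 3-point averaging stencil, chosen only because it is short and easy to check. The function names `naive_sweeps` and `blocked_sweeps`, the `block` parameter, and the stencil itself are all assumptions made for this example.

```python
def naive_sweeps(u, steps):
    """Apply `steps` Jacobi sweeps of a 3-point averaging stencil.

    Every sweep streams over the whole grid, so a grid larger than
    cache is reloaded from main memory once per sweep.
    The two endpoints are held fixed (hypothetical boundary condition).
    """
    cur = list(u)
    for _ in range(steps):
        nxt = cur[:]
        for i in range(1, len(cur) - 1):
            nxt[i] = (cur[i - 1] + cur[i] + cur[i + 1]) / 3.0
        cur = nxt
    return cur


def blocked_sweeps(u, steps, block):
    """Same result as naive_sweeps, computed block by block.

    Each block of `block` interior points is copied together with a
    halo of `steps` points per side and advanced through all sweeps
    before moving on, so the working set per block is small. The halo
    points are recomputed redundantly by neighbouring blocks.
    """
    n = len(u)
    out = list(u)
    for start in range(1, n - 1, block):
        stop = min(start + block, n - 1)   # interior points [start, stop)
        lo = max(start - steps, 0)         # halo, clipped to the domain
        hi = min(stop + steps, n)
        local = list(u[lo:hi])
        for _ in range(steps):
            nxt = local[:]
            for j in range(1, len(local) - 1):
                nxt[j] = (local[j - 1] + local[j] + local[j + 1]) / 3.0
            local = nxt
        # Stale values from the frozen halo edges travel one point per
        # sweep, so after `steps` sweeps they have not reached [start, stop).
        out[start:stop] = local[start - lo:stop - lo]
    return out
```

Both functions return the same grid; the blocked version trades some redundant halo computation for touching each block's data only once. In pure Python the cache effect itself is not observable, so the sketch shows only the loop structure, not a performance claim.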

Shoaib Kamil, Kaushik Datta, Samuel Williams, Leonid Oliker, John Shalf, Katherine A. Yelick

ACMMSP 2006 | Cache Oblivious Approach | Cell Processor | Hardware | Optimizations Target Cache |

» CkDirect Unsynchronized OneSided Communication in a MessageDriven Paradigm

» Stochastic Motion and the Level Set Method in Computer Vision Stochastic Active Contours

» Defining implicit objective functions for design problems

» TFMAP optimizing MAP for topn contextaware recommendation

» Exploring medical data using visual spaces with genetic programming and implicit functiona...

» Optimal sampling from sliding windows

» Synthesis and optimization of coordination controllers for distributed embedded systems

» Three Ways to Grow Designs A Comparison of Embryogenies for an Evolutionary Design Problem

Post Info

Added	13 Jun 2010
Updated	13 Jun 2010
Type	Conference
Year	2006
Where	ACMMSP
Authors	Shoaib Kamil, Kaushik Datta, Samuel Williams, Leonid Oliker, John Shalf, Katherine A. Yelick
