Sciweavers

363 search results - page 39 / 73
» Optimizing Memory Accesses For Spatial Computation
Sort
View
164
Voted
ASPLOS
1994
ACM
15 years 7 months ago
Compiler Optimizations for Improving Data Locality
In the past decade, processor speed has become significantly faster than memory speed. Small, fast cache memories are designed to overcome this discrepancy, but they are only effe...
Steve Carr, Kathryn S. McKinley, Chau-Wen Tseng
116
Voted
IPPS
2008
IEEE
15 years 9 months ago
Lattice Boltzmann simulation optimization on leading multicore platforms
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of searchbased performance optimizatio...
Samuel Williams, Jonathan Carter, Leonid Oliker, J...
242
Voted
CGF
2011
14 years 9 months ago
A Parallel SPH Implementation on Multi-Core CPUs
This paper presents a parallel framework for simulating fluids with the Smoothed Particle Hydrodynamics (SPH) method. For low computational costs per simulation step, efficient ...
Markus Ihmsen, Nadir Akinci, Markus Becker, Matthi...
153
Voted
ICPP
2009
IEEE
15 years 9 months ago
Perfomance Models for Blocked Sparse Matrix-Vector Multiplication Kernels
—Sparse Matrix-Vector multiplication (SpMV) is a very challenging computational kernel, since its performance depends greatly on both the input matrix and the underlying architec...
Vasileios Karakasis, Georgios I. Goumas, Nectarios...
116
Voted
ICS
2000
Tsinghua U.
15 years 6 months ago
Fast greedy weighted fusion
Loop fusion is important to optimizing compilers because it is an important tool in managing the memory hierarchy. By fusing loops that use the same data elements, we can reduce t...
Ken Kennedy