Sciweavers

87 search results - page 2 / 18
» Improving the Memory Bandwidth Utilization Using Loop Transf...
Sort
View
PCI
2005
Springer
13 years 10 months ago
Tuning Blocked Array Layouts to Exploit Memory Hierarchy in SMT Architectures
Cache misses form a major bottleneck for memory-intensive applications, due to the significant latency of main memory accesses. Loop tiling, in conjunction with other program tran...
Evangelia Athanasaki, Kornilios Kourtis, Nikos Ana...
ICPP
1999
IEEE
13 years 9 months ago
A Framework for Interprocedural Locality Optimization Using Both Loop and Data Layout Transformations
There has been much work recently on improving the locality performance of loop nests in scientific programs through the use of loop as well as data layout optimizations. However,...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
CF
2006
ACM
13 years 11 months ago
Improving the memory behavior of vertical filtering in the discrete wavelet transform
The discrete wavelet transform (DWT) is used in several image and video compression standards, in particular JPEG2000. A 2D DWT consists of horizontal filtering along the rows fo...
Asadollah Shahbahrami, Ben H. H. Juurlink, Stamati...
TPDS
1998
157views more  TPDS 1998»
13 years 4 months ago
A Compiler Optimization Algorithm for Shared-Memory Multiprocessors
This paper presents a new compiler optimization algorithm that parallelizes applications for symmetric, sharedmemory multiprocessors. The algorithm considers data locality, parall...
Kathryn S. McKinley
IPPS
2007
IEEE
13 years 11 months ago
Optimizing Inter-Nest Data Locality Using Loop Splitting and Reordering
With the increasing gap between processor speed and memory latency, the performance of data-dominated programs are becoming more reliant on fast data access, which can be improved...
Sofiane Naci