Sciweavers

87 search results - page 4 / 18
» Improving the Memory Bandwidth Utilization Using Loop Transf...
ICS
1992
Tsinghua U.
15 years 3 months ago
Optimizing for parallelism and data locality
Previous research has used program transformation to introduce parallelism and to exploit data locality. Unfortunately, these two objectives have usually been considered independentl...
Ken Kennedy, Kathryn S. McKinley
LCPC
2009
Springer
15 years 4 months ago
A Balanced Approach to Application Performance Tuning
Current hardware trends place increasing pressure on programmers and tools to optimize scientific code. Numerous tools and techniques exist, but no single tool is a pana...
Souad Koliai, Stéphane Zuckerman, Emmanuel ...
ICCAD
1997
IEEE
15 years 3 months ago
Exploiting off-chip memory access modes in high-level synthesis
Memory-intensive behaviors often contain large arrays that are synthesized into off-chip memories. With the increasing gap between on-chip and off-chip memory access delays, it is...
Preeti Ranjan Panda, Nikil D. Dutt, Alexandru Nico...
MICRO
2000
IEEE
14 years 11 months ago
An Advanced Optimizer for the IA-64 Architecture
level of abstraction, compared with the program representation for scalar optimizations. For example, loop unrolling and loop unroll-and-jam transformations exploit the large regist...
Rakesh Krishnaiyer, Dattatraya Kulkarni, Daniel M....
ASPLOS
1994
ACM
15 years 3 months ago
Compiler Optimizations for Improving Data Locality
In the past decade, processor speed has become significantly faster than memory speed. Small, fast cache memories are designed to overcome this discrepancy, but they are only effe...
Steve Carr, Kathryn S. McKinley, Chau-Wen Tseng