Sciweavers

52 search results - page 2 / 11
» Loop Alignment for Memory Accesses Optimization
Sort
View
CASES
2008
ACM
13 years 7 months ago
Efficient vectorization of SIMD programs with non-aligned and irregular data access hardware
Automatic vectorization of programs for partitioned-ALU SIMD (Single Instruction Multiple Data) processors has been difficult because of not only data dependency issues but also n...
Hoseok Chang, Wonyong Sung
CONCURRENCY
2006
140views more  CONCURRENCY 2006»
13 years 5 months ago
An efficient memory operations optimization technique for vector loops on Itanium 2 processors
To keep up with a large degree of instruction level parallelism (ILP), the Itanium 2 cache systems use a complex organization scheme: load/store queues, banking and interleaving. ...
William Jalby, Christophe Lemuet, Sid Ahmed Ali To...
MICRO
2000
IEEE
176views Hardware» more  MICRO 2000»
13 years 5 months ago
An Advanced Optimizer for the IA-64 Architecture
level of abstraction, compared with the program representation for scalar optimizations. For example, loop unrolling and loop unrolland-jam transformations exploit the large regist...
Rakesh Krishnaiyer, Dattatraya Kulkarni, Daniel M....
ICPP
1998
IEEE
13 years 10 months ago
A memory-layout oriented run-time technique for locality optimization
Exploiting locality at run-time is a complementary approach to a compiler approach for those applications with dynamic memory access patterns. This paper proposes a memory-layout ...
Yong Yan, Xiaodong Zhang, Zhao Zhang
ICS
1992
Tsinghua U.
13 years 9 months ago
Optimizing for parallelism and data locality
Previous research has used program transformation to introduce parallelism and to exploit data locality. Unfortunately,these twoobjectives have usuallybeen considered independentl...
Ken Kennedy, Kathryn S. McKinley