Sciweavers

» Optimal loop parallelization for maximizing iteration-level ...
PPOPP
2005
ACM
15 years 3 months ago
Performance modeling and optimization of parallel out-of-core tensor contractions
The Tensor Contraction Engine (TCE) is a domain-specific compiler for implementing complex tensor contraction expressions arising in quantum chemistry applications modeling elect...
Xiaoyang Gao, Swarup Kumar Sahoo, Chi-Chung Lam, J...
IEEEPACT
2007
IEEE
15 years 3 months ago
Automatic Correction of Loop Transformations
Loop nest optimization is a combinatorial problem. Due to the growing complexity of modern architectures, it involves two increasingly difficult tasks: (1) analyzing the profita...
Nicolas Vasilache, Albert Cohen, Louis-Noël P...
IEEEPACT
2002
IEEE
15 years 2 months ago
Optimizing Loop Performance for Clustered VLIW Architectures
Modern embedded systems often require high degrees of instruction-level parallelism (ILP) within strict constraints on power consumption and chip cost. Unfortunately, a high-perfo...
Yi Qian, Steve Carr, Philip H. Sweany
ICS
2000
Tsinghua U.
15 years 1 months ago
Fast greedy weighted fusion
Loop fusion is important to optimizing compilers because it is an important tool in managing the memory hierarchy. By fusing loops that use the same data elements, we can reduce t...
Ken Kennedy
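As a rough illustration of the loop fusion idea in the abstract above, the sketch below merges two loops that read the same array so each element is loaded once per iteration. It is only an assumed toy example: the function names `unfused` and `fused` are hypothetical, and it does not reproduce the greedy weighted fusion algorithm of the paper.

```python
def unfused(a, b):
    """Two separate loops; each one traverses `a` from start to end."""
    n = len(a)
    c = [0.0] * n
    d = [0.0] * n
    for i in range(n):
        c[i] = a[i] + b[i]   # first pass over a
    for i in range(n):
        d[i] = a[i] * 2.0    # second pass over a
    return c, d


def fused(a, b):
    """One fused loop; a[i] is read once and reused for both results."""
    n = len(a)
    c = [0.0] * n
    d = [0.0] * n
    for i in range(n):
        x = a[i]             # single read of the shared element
        c[i] = x + b[i]
        d[i] = x * 2.0
    return c, d


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0]
    b = [4.0, 5.0, 6.0]
    # Fusion is legal here because neither loop writes data the other reads.
    print(unfused(a, b) == fused(a, b))
```

Both versions compute the same results; the fused form touches `a` once instead of twice, which is the kind of memory-traffic saving a compiler would aim for on large arrays.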
EUROPAR
2001
Springer
15 years 1 months ago
Loop-Carried Code Placement
Traditional code optimization techniques treat loops as nonpredictable structures and do not consider expressions containing array accesses for optimization. We show that...
Peter Faber, Martin Griebl, Christian Lengauer