Sciweavers

379 search results - page 6 / 76
» Optimal loop parallelization for maximizing iteration-level ...
MICRO
2005
IEEE
15 years 3 months ago
Exploiting Vector Parallelism in Software Pipelined Loops
An emerging trend in processor design is the addition of short vector instructions to general-purpose and embedded ISAs. Frequently, these extensions are employed using traditiona...
Samuel Larsen, Rodric M. Rabbah, Saman P. Amarasin...
IEEEPACT
1998
IEEE
15 years 1 months ago
A Matrix-Based Approach to the Global Locality Optimization Problem
Global locality analysis is a technique for improving the cache performance of a sequence of loop nests through a combination of loop and data layout optimizations. Pure loop tran...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
IFL
2005
Springer
15 years 2 months ago
With-Loop Fusion for Data Locality and Parallelism
With-loops are versatile array comprehensions used in the functional array language SaC to implement universally applicable array operations. We describe the fusion of with-loops a...
Clemens Grelck, Karsten Hinckfuß, Sven-Bodo ...
MICRO
1995
IEEE
15 years 29 days ago
Improving instruction-level parallelism by loop unrolling and dynamic memory disambiguation
Exploitation of instruction-level parallelism is an effective mechanism for improving the performance of modern super-scalar/VLIW processors. Various software techniques can be appl...
Jack W. Davidson, Sanjay Jinturkar
ICS
2000
Tsinghua U.
15 years 1 months ago
Automatic loop transformations and parallelization for Java
From a software engineering perspective, the Java programming language provides an attractive platform for writing numerically intensive applications. A major drawback hampering i...
Pedro V. Artigas, Manish Gupta, Samuel P. Midkiff,...