ICS
15 years 7 months ago
1995 Tsinghua U.
Current data cache organizations fail to deliver high performance in scalar processors for many vector applications. There are two main reasons for this loss of performance: the u...
ICS
15 years 7 months ago
1995 Tsinghua U.
In this paper we give a new run-time technique for finding an optimal parallel execution schedule for a partially parallel loop, i.e., a loop whose parallelization requires syn...
ICS
15 years 7 months ago
1995 Tsinghua U.
Modulo scheduling is an efficient technique for exploiting instruction-level parallelism in a variety of loops, resulting in high performance code but increased register requirement...
ICS
15 years 7 months ago
1995 Tsinghua U.
The elimination of induction variables and the parallelization of reductions in FORTRAN programs have been shown to be integral to performance improvement on parallel computers 7,...
ICS
15 years 7 months ago
1995 Tsinghua U.
In this paper, we present a GSA-based technique that performs more efficient and more precise symbolic analysis of predicated assignments, recurrences and index arrays. The efficiency...