Sciweavers

421 search results - page 35 / 85
» An Intelligent Parallel Loop Scheduling for Parallelizing Co...
ISPAN
1997
IEEE
15 years 1 month ago
CASS: an efficient task management system for distributed memory architectures
The thesis of this research is that the task of exposing the parallelism in a given application should be left to the algorithm designer, who has intimate knowledge of the applica...
Jing-Chiou Liou, Michael A. Palis
ISCA
1993
IEEE
15 years 1 month ago
Evaluation of Mechanisms for Fine-Grained Parallel Programs in the J-Machine and the CM-5
This paper uses an abstract machine approach to compare the mechanisms of two parallel machines: the J-Machine and the CM-5. High-level parallel programs are translated by a single optimi...
Ellen Spertus, Seth Copen Goldstein, Klaus E. Scha...
LCPC
2009
Springer
15 years 2 months ago
Loop Transformation Recipes for Code Generation and Auto-Tuning
In this paper, we describe transformation recipes, which provide a high-level interface to the code transformation and code generation capability of a compiler. These rec...
Mary W. Hall, Jacqueline Chame, Chun Chen, Jaewook...
PLDI
1995
ACM
15 years 1 month ago
Interprocedural Partial Redundancy Elimination and its Application to Distributed Memory Compilation
Partial Redundancy Elimination (PRE) is a general scheme for suppressing partial redundancies, which encompasses traditional optimizations like loop-invariant code motion and redun...
Gagan Agrawal, Joel H. Saltz, Raja Das
ICS
2000
Tsinghua U.
15 years 1 month ago
Synthesizing transformations for locality enhancement of imperfectly-nested loop nests
We present an approach for synthesizing transformations to enhance locality in imperfectly-nested loops. The key idea is to embed the iteration space of every statement in a loop ...
Nawaaz Ahmed, Nikolay Mateev, Keshav Pingali