Sciweavers

421 search results - page 35 / 85
» An Intelligent Parallel Loop Scheduling for Parallelizing Co...
ISPAN
1997
IEEE
CASS: an efficient task management system for distributed memory architectures
The thesis of this research is that the task of exposing the parallelism in a given application should be left to the algorithm designer, who has intimate knowledge of the applica...
Jing-Chiou Liou, Michael A. Palis
ISCA
1993
IEEE
Evaluation of Mechanisms for Fine-Grained Parallel Programs in the J-Machine and the CM-5
This paper uses an abstract machine approach to compare the mechanisms of two parallel machines: the J-Machine and the CM-5. High-level parallel programs are translated by a single optimi...
Ellen Spertus, Seth Copen Goldstein, Klaus E. Scha...
LCPC
2009
Springer
Loop Transformation Recipes for Code Generation and Auto-Tuning
In this paper, we describe transformation recipes, which provide a high-level interface to the code transformation and code generation capability of a compiler. These rec...
Mary W. Hall, Jacqueline Chame, Chun Chen, Jaewook...
PLDI
1995
ACM
Interprocedural Partial Redundancy Elimination and its Application to Distributed Memory Compilation
Partial Redundancy Elimination (PRE) is a general scheme for suppressing partial redundancies which encompasses traditional optimizations like loop invariant code motion and redun...
Gagan Agrawal, Joel H. Saltz, Raja Das
ICS
2000
Tsinghua U.
Synthesizing transformations for locality enhancement of imperfectly-nested loop nests
We present an approach for synthesizing transformations to enhance locality in imperfectly-nested loops. The key idea is to embed the iteration space of every statement in a loop ...
Nawaaz Ahmed, Nikolay Mateev, Keshav Pingali