Sciweavers

1370 search results - page 185 / 274
» Synchronization Transformations for Parallel Computing
SPAA
2010
ACM
15 years 2 months ago
Buffer-space efficient and deadlock-free scheduling of stream applications on multi-core architectures
We present a scheduling algorithm of stream programs for multi-core architectures called team scheduling. Compared to previous multi-core stream scheduling algorithms, team schedu...
JongSoo Park, William J. Dally
IPPS
2010
IEEE
14 years 11 months ago
Efficient hardware support for the Partitioned Global Address Space
We present a novel architecture of a communication engine for non-coherent distributed shared memory systems. The shared memory is composed of a set of nodes exporting their memory...
Holger Fröning, Heiner Litz
STACS
1995
Springer
15 years 5 months ago
Generalized Scans and Tri-Diagonal Systems
Motivated by the analysis of known parallel techniques for the solution of linear tridiagonal systems, we introduce generalized scans, a class of recursively defined length-preserving...
Paul F. Fischer, Franco P. Preparata, John E. Sava...
CGO
2008
IEEE
15 years 8 months ago
Parallel-stage decoupled software pipelining
In recent years, the microprocessor industry has embraced chip multiprocessors (CMPs), also known as multi-core architectures, as the dominant design paradigm. For existing and ne...
Easwaran Raman, Guilherme Ottoni, Arun Raman, Matt...
IPPS
2003
IEEE
15 years 7 months ago
Global Communication Optimization for Tensor Contraction Expressions under Memory Constraints
The accurate modeling of the electronic structure of atoms and molecules involves computationally intensive tensor contractions involving large multi-dimensional arrays. The effi...
Daniel Cociorva, Xiaoyang Gao, Sandhya Krishnan, G...