Sciweavers

LCPC
1995
Springer
15 years 8 months ago
Demand-Driven, Symbolic Range Propagation
Abstract. To effectively parallelize real programs, parallelizing compilers need powerful symbolic analysis techniques [13, 6]. In previous work we have introduced an algorithm calle...
William Blume, Rudolf Eigenmann
CAMP
2005
IEEE
15 years 6 months ago
Energy/Performance Evaluation of the Multithreaded Extension of a Multicluster VLIW Processor
Abstract. In this paper we address the problem of the architectural exploration from the energy/performance point of view of a VLIW processor for embedded systems. We also consid...
Domenico Barretta, Gianluca Palermo, Mariagiovanna...
PVM
2010
Springer
15 years 2 months ago
Load Balancing for Regular Meshes on SMPs with MPI
Abstract. Domain decomposition for regular meshes on parallel computers has traditionally been performed by attempting to exactly partition the work among the available processors ...
Vivek Kale, William Gropp
PPOPP
2010
ACM
16 years 2 months ago
Model-driven autotuning of sparse matrix-vector multiply on GPUs
We present a performance model-driven framework for automated performance tuning (autotuning) of sparse matrix-vector multiply (SpMV) on systems accelerated by graphics processing...
Jee W. Choi, Amik Singh, Richard W. Vuduc
ISPDC
2008
IEEE
15 years 11 months ago
Heterogeneous PBLAS: Optimization of PBLAS for Heterogeneous Computational Clusters
This paper presents a package, called Heterogeneous PBLAS (HeteroPBLAS), which is built on top of PBLAS and provides optimized parallel basic linear algebra subprograms for hetero...
Ravi Reddy Manumachu, Alexey L. Lastovetsky, Pedro...