Sciweavers

» Pillar: A Parallel Implementation Language
ASPLOS
2009
ACM
16 years 4 months ago
QR decomposition on GPUs
QR decomposition is a computationally intensive linear algebra operation that factors a matrix A into the product of a unitary matrix Q and upper triangular matrix R. Adaptive sys...
Andrew Kerr, Dan Campbell, Mark Richards
ICPP
2008
IEEE
15 years 9 months ago
Solving Large, Irregular Graph Problems Using Adaptive Work-Stealing
Solving large, irregular graph problems efficiently is challenging. Current software systems and commodity multiprocessors do not support fine-grained, irregular parallelism wel...
Guojing Cong, Sreedhar B. Kodali, Sriram Krishnamo...
IWOMP
2009
Springer
15 years 10 months ago
Scalability Evaluation of Barrier Algorithms for OpenMP
OpenMP relies heavily on barrier synchronization to coordinate the work of threads that are performing the computations in a parallel region. A good implementation of barriers is ...
Ramachandra C. Nanjegowda, Oscar Hernandez, Barbar...
SAC
2003
ACM
15 years 8 months ago
Coordination-Based Distributed Constraint Solving in DICE
DICE (DIstributed Constraint Environment) is a framework for the construction of distributed constraint solvers from software components in a number of predefined categories. The...
Peter Zoeteweij
IPPS
2010
IEEE
15 years 1 month ago
Speculative execution on multi-GPU systems
The lag of parallel programming models and languages behind the advance of heterogeneous many-core processors has left a gap between the computational capability of moder...
Gregory F. Diamos, Sudhakar Yalamanchili