Sciweavers

1624 search results - page 286 / 325
» Distributed Threads in Java
PPOPP
2011
ACM
14 years 6 months ago
GRace: a low-overhead mechanism for detecting data races in GPU programs
In recent years, GPUs have emerged as an extremely cost-effective means for achieving high performance. Many application developers, including those with no prior parallel program...
Mai Zheng, Vignesh T. Ravi, Feng Qin, Gagan Agrawa...
CLUSTER
2011
IEEE
14 years 3 months ago
Performance Characterization and Optimization of Atomic Operations on AMD GPUs
Atomic operations are important building blocks in supporting general-purpose computing on graphics processing units (GPUs). For instance, they can be used to coordinate executi...
Marwa Elteir, Heshan Lin, Wu-chun Feng
EUROPAR
2011
Springer
14 years 3 months ago
A Bit-Compatible Parallelization for ILU(k) Preconditioning
ILU(k) is a commonly used preconditioner for iterative linear solvers for sparse, non-symmetric systems. It is often preferred for the sake of its stability. We present T...
Xin Dong 0004, Gene Cooperman
PPOPP
2012
ACM
13 years 11 months ago
PARRAY: a unifying array representation for heterogeneous parallelism
This paper introduces a programming interface called PARRAY (or Parallelizing ARRAYs) that supports system-level succinct programming for heterogeneous parallel systems like GPU c...
Yifeng Chen, Xiang Cui, Hong Mei
PPOPP
2012
ACM
13 years 11 months ago
Better speedups using simpler parallel programming for graph connectivity and biconnectivity
Speedups demonstrated for finding the biconnected components of a graph: 9x to 33x on the Explicit Multi-Threading (XMT) many-core computing platform relative to the best serial ...
James A. Edwards, Uzi Vishkin