Sciweavers

1624 search results - page 286 / 325
» Distributed Threads in Java
Sort
View
PPOPP
2011
ACM
14 years 14 days ago
GRace: a low-overhead mechanism for detecting data races in GPU programs
In recent years, GPUs have emerged as an extremely cost-effective means for achieving high performance. Many application developers, including those with no prior parallel program...
Mai Zheng, Vignesh T. Ravi, Feng Qin, Gagan Agrawa...
CLUSTER
2011
IEEE
13 years 9 months ago
Performance Characterization and Optimization of Atomic Operations on AMD GPUs
—Atomic operations are important building blocks in supporting general-purpose computing on graphics processing units (GPUs). For instance, they can be used to coordinate executi...
Marwa Elteir, Heshan Lin, Wu-chun Feng
EUROPAR
2011
Springer
13 years 9 months ago
A Bit-Compatible Parallelization for ILU(k) Preconditioning
Abstract. ILU(k) is a commonly used preconditioner for iterative linear solvers for sparse, non-symmetric systems. It is often preferred for the sake of its stability. We present T...
Xin Dong 0004, Gene Cooperman
PPOPP
2012
ACM
13 years 5 months ago
PARRAY: a unifying array representation for heterogeneous parallelism
This paper introduces a programming interface called PARRAY (or Parallelizing ARRAYs) that supports system-level succinct programming for heterogeneous parallel systems like GPU c...
Yifeng Chen, Xiang Cui, Hong Mei
PPOPP
2012
ACM
13 years 5 months ago
Better speedups using simpler parallel programming for graph connectivity and biconnectivity
Speedups demonstrated for finding the biconnected components of a graph: 9x to 33x on the Explicit Multi-Threading (XMT) many-core computing platform relative to the best serial ...
James A. Edwards, Uzi Vishkin