Sciweavers

ASPLOS
2009
ACM
16 years 1 month ago
Kendo: efficient deterministic multithreading in software
Although chip-multiprocessors have become the industry standard, developing parallel applications that target them remains a daunting task. Non-determinism, inherent in threaded a...
Marek Olszewski, Jason Ansel, Saman P. Amarasinghe
ASPLOS
2009
ACM
16 years 1 month ago
3D finite difference computation on GPUs using CUDA
In this paper we describe a GPU parallelization of the 3D finite difference computation using CUDA. Data access redundancy is used as the metric to determine the optimal implement...
Paulius Micikevicius
ASPLOS
2009
ACM
16 years 1 month ago
QR decomposition on GPUs
QR decomposition is a computationally intensive linear algebra operation that factors a matrix A into the product of a unitary matrix Q and upper triangular matrix R. Adaptive sys...
Andrew Kerr, Dan Campbell, Mark Richards
ASPLOS
2009
ACM
16 years 1 month ago
Maximum benefit from a minimal HTM
A minimal, bounded hardware transactional memory implementation significantly improves synchronization performance when used in an operating system kernel. We add HTM to Linux 2.4...
Owen S. Hofmann, Christopher J. Rossbach, Emmett W...
ASPLOS
2009
ACM
15 years 4 months ago
Accelerating phase unwrapping and affine transformations for optical quadrature microscopy using CUDA
Optical Quadrature Microscopy (OQM) is a process which uses phase data to capture information about the sample being studied. OQM is part of an imaging framework developed by the ...
Perhaad Mistry, Sherman Braganza, David R. Kaeli, ...
Programming Languages