Sciweavers

PPOPP
2009
ACM
14 years 5 months ago
Detecting and tolerating asymmetric races
Because data races represent a hard-to-manage class of errors in concurrent programs, numerous approaches to detect them have been proposed and evaluated. We specifically consider...
Paruj Ratanaworabhan, Martin Burtscher, Darko Kiro...
PPOPP
2009
ACM
14 years 5 months ago
Comparability graph coloring for optimizing utilization of stream register files in stream processors
A stream processor executes an application that has been decomposed into a sequence of kernels that operate on streams of data elements. During the execution of a kernel, all stre...
Xuejun Yang, Li Wang, Jingling Xue, Yu Deng, Ying ...
PPOPP
2009
ACM
14 years 5 months ago
OpenMP to GPGPU: a compiler framework for automatic translation and optimization
GPGPUs have recently emerged as powerful vehicles for general-purpose high-performance computing. Although a new Compute Unified Device Architecture (CUDA) programming model from N...
Seyong Lee, Seung-Jai Min, Rudolf Eigenmann
PPOPP
2009
ACM
14 years 5 months ago
Parallelization spectroscopy: analysis of thread-level parallelism in HPC programs
In this paper, we present a thorough analysis of thread-level parallelism available in production High Performance Computing (HPC) codes. We survey a number of techniques that are...
Arun Kejariwal, Calin Cascaval
PPOPP
2009
ACM
14 years 5 months ago
Software transactional distributed shared memory
We have developed a transaction-based approach to distributed shared memory (DSM) that supports object caching and generates path expression prefetches. A path expression specifies...
Alokika Dash, Brian Demsky
PPOPP
2009
ACM
14 years 5 months ago
Serialization sets: a dynamic dependence-based parallel execution model
This paper proposes a new parallel execution model where programmers augment a sequential program with pieces of code called serializers that dynamically map computational operati...
Matthew D. Allen, Srinath Sridharan, Gurindar S. S...
PPOPP
2009
ACM
14 years 5 months ago
Atomic Quake: using transactional memory in an interactive multiplayer game server
Transactional Memory (TM) is being studied widely as a new technique for synchronizing concurrent accesses to shared memory data structures for use in multi-core systems. Much of ...
Adrián Cristal, Eduard Ayguadé, Fera...
PPOPP
2009
ACM
14 years 5 months ago
Mapping parallelism to multi-cores: a machine learning based approach
The efficient mapping of program parallelism to multi-core processors is highly dependent on the underlying architecture. This paper proposes a portable and automatic compiler-bas...
Zheng Wang, Michael F. P. O'Boyle
PPOPP
2009
ACM
14 years 5 months ago
A compiler-directed data prefetching scheme for chip multiprocessors
Data prefetching has been widely used in the past as a technique for hiding memory access latencies. However, data prefetching in multi-threaded applications running on chip multi...
Dhruva Chakrabarti, Mahmut T. Kandemir, Mustafa Ka...