Sciweavers

135
Voted
PPOPP
2009
ACM
16 years 29 days ago
OpenMP to GPGPU: a compiler framework for automatic translation and optimization
GPGPUs have recently emerged as powerful vehicles for generalpurpose high-performance computing. Although a new Compute Unified Device Architecture (CUDA) programming model from N...
Seyong Lee, Seung-Jai Min, Rudolf Eigenmann
103
Voted
PPOPP
2009
ACM
16 years 29 days ago
Idempotent work stealing
Load balancing is a technique which allows efficient parallelization of irregular workloads, and a key component of many applications and parallelizing runtimes. Work-stealing is ...
Maged M. Michael, Martin T. Vechev, Vijay A. Saras...
PPOPP
2009
ACM
16 years 29 days ago
Efficient and scalable multiprocessor fair scheduling using distributed weighted round-robin
Fairness is an essential requirement of any operating system scheduler. Unfortunately, existing fair scheduling algorithms are either inaccurate or inefficient and non-scalable fo...
Tong Li, Dan P. Baumberger, Scott Hahn
PPOPP
2009
ACM
16 years 29 days ago
A comprehensive strategy for contention management in software transactional memory
In Software Transactional Memory (STM), contention management refers to the mechanisms used to ensure forward progress-to avoid livelock and starvation, and to promote throughput ...
Michael F. Spear, Luke Dalessandro, Virendra J. Ma...
PPOPP
2009
ACM
16 years 29 days ago
Formal verification of practical MPI programs
This paper considers the problem of formal verification of MPI programs operating under a fixed test harness for safety properties without building verification models. In our app...
Anh Vo, Sarvani S. Vakkalanka, Michael Delisi, Gan...
Distributed And Parallel Computing
Top of PageReset Settings