Sciweavers

583 search results - page 27 / 117
» NAS Parallel Benchmark Results
Sort
View
PPOPP
2012
ACM
13 years 7 months ago
Internally deterministic parallel algorithms can be fast
The virtues of deterministic parallelism have been argued for decades and many forms of deterministic parallelism have been described and analyzed. Here we are concerned with one ...
Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gib...
IPPS
2005
IEEE
15 years 5 months ago
Runtime Empirical Selection of Loop Schedulers on Hyperthreaded SMPs
Hyperthreaded (HT) and simultaneous multithreaded (SMT) processors are now available in commodity workstations and servers. This technology is designed to increase throughput by e...
Yun Zhang, Michael Voss
CLUSTER
2008
IEEE
15 years 1 months ago
Efficient one-copy MPI shared memory communication in Virtual Machines
Efficient intra-node shared memory communication is important for High Performance Computing (HPC), especially with the emergence of multi-core architectures. As clusters continue ...
Wei Huang, Matthew J. Koop, Dhabaleswar K. Panda
PPOPP
2003
ACM
15 years 5 months ago
Using thread-level speculation to simplify manual parallelization
In this paper, we provide examples of how thread-level speculation (TLS) simplifies manual parallelization and enhances its performance. A number of techniques for manual parallel...
Manohar K. Prabhu, Kunle Olukotun
ICS
2009
Tsinghua U.
15 years 6 months ago
MPI-aware compiler optimizations for improving communication-computation overlap
Several existing compiler transformations can help improve communication-computation overlap in MPI applications. However, traditional compilers treat calls to the MPI library as ...
Anthony Danalis, Lori L. Pollock, D. Martin Swany,...