Achieving high performance for concurrent applications on modern multiprocessors remains challenging. Many programmers avoid locking to improve performance, while others replace l...
Thomas E. Hart, Paul E. McKenney, Angela Demke Bro...
We present a novel mechanism, called meeting point thread characterization, to dynamically detect critical threads in a parallel region. We define the critical thread as the one with...
Targeted optimization of program segments can provide an additional program speedup over the highest default optimization level, such as -O3 in GCC. The key challenge is how to au...
Haiping Wu, Eunjung Park, Mihailo Kaplarevic, Ying...
High-performance intra-node communication support for MPI applications is critical for achieving the best performance from clusters of SMP workstations. Present-day MPI stacks cannot ...
Hyun-Wook Jin, Sayantan Sur, Lei Chai, Dhabaleswar...
We assert that in order to perform well, a shared-memory multiprocessor inter-process communication (IPC) facility must avoid a) accessing any shared data, and b) acquiring any locks...