Many parallel applications from scientific computing use MPI collective communication operations to collect or distribute data. Since the execution times of these communication op...
Thread-Level Speculation (TLS) allows us to automatically parallelize general-purpose programs by supporting parallel execution of threads that might not actually be independent. ...
J. Gregory Steffan, Christopher B. Colohan, Antoni...
As on-chip integration matures, single-chip system designers must not only be concerned with component-level issues such as performance and power, but also with on-chip system-leve...
This paper presents a new LogP-based model, called LoOgGP, which allows an accurate characterization of MPI applications based on microbenchmark measurements. This new model is an...
Synchronization operations, such as fence and locking, are used in many parallel operations accessing shared memory. However, a process which is blocked waiting for a fence operat...
Darius Buntinas, Amina Saify, Dhabaleswar K. Panda...