Targeted optimization of program segments can provide an additional program speedup over the highest default optimization level, such as -O3 in GCC. The key challenge is how to au...
Haiping Wu, Eunjung Park, Mihailo Kaplarevic, Ying...
We propose and evaluate a novel approach for automatic parallelization. The approach uses traces as units of parallel work. We discuss the benefits and challenges of the use of t...
We demonstrate the benefits of instruction-set simulation in the evaluation of a parallel programming system, Penny. The simulator is a reliable tool in exploring design alternativ...
In this study, we introduce an evaluation methodology for advanced memory systems. This methodology is based on statistical factorial analysis. It is twofold: it first determines...
Xian-He Sun, Dongmei He, Kirk W. Cameron, Yong Luo
Multiprocessors are now commonplace, and cloud computing is swiftly following suit. While it is possible to write high-performance code for these systems, concurrency bugs are ext...