Existing dynamic race detectors suffer from at least one of the following three limitations: (i) space overhead per memory location grows linearly with the number of parallel thre...
While graphics processing units (GPUs) provide low-cost and efficient platforms for accelerating high performance computations, the tedious process of performance tuning required...
Mehrzad Samadi, Amir Hormati, Mojtaba Mehrara, Jan...
Software-based thread-level parallelization has been widely studied for exploiting data parallelism in purely computational loops to improve program performance on multiprocessors...
Modularity is a central theme in any scalable program analysis. The core idea in a modular analysis is to build summaries at procedure boundaries, and use the summary of a procedu...
Aws Albarghouthi, Rahul Kumar, Aditya V. Nori, Sri...
— Accurate query performance prediction (QPP) is central to effective resource management, query optimization and query scheduling. Analytical cost models, used in current genera...