Given the large communication overheads characteristic of modern parallel machines, optimizations that eliminate, hide or parallelize communication may improve the performance of ...
Abstract. Understanding program behavior is at the foundation of program optimization. Techniques for automatic recognition of program constructs characterize the behavior of code ...
Thread-level speculation is a technique that brings thread-level parallelism beyond the data-flow limit by executing a piece of code ahead of time speculatively before all its inp...
Xiao-Feng Li, Zhao-Hui Du, Chen Yang, Chu-Cheow Li...
Multicore designs have emerged as the mainstream design paradigm for the microprocessor industry. Unfortunately, providing multiple cores does not directly translate into performa...
Mojtaba Mehrara, Jeff Hao, Po-Chun Hsu, Scott A. M...
Abstract—Recent years have seen a trend in using graphic processing units (GPU) as accelerators for general-purpose computing. The inexpensive, single-chip, massively parallel ar...