Pre-execution techniques have received much attention as an effective way of prefetching cache blocks to tolerate the everincreasing memory latency. A number of pre-execution tech...
Dongkeun Kim, Shih-Wei Liao, Perry H. Wang, Juan d...
Multicore designs have emerged as the mainstream design paradigm for the microprocessor industry. Unfortunately, providing multiple cores does not directly translate into performa...
Mojtaba Mehrara, Jeff Hao, Po-Chun Hsu, Scott A. M...
Abstract. In order to take full advantage of high-end computing platforms, scientific applications often require modifications to source codes, and to their build systems that ge...
Cache locality optimization is an efficient way for reducing the idle time of modern processors in waiting for needed data. This kind of optimization can be achieved either on the...
abstraction for modeling these problems is to view the Web as a collection of (usually small and heterogeneous) databases, and to view programs that extract and process Web data au...