We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of search-based performance optimizatio...
Samuel Williams, Jonathan Carter, Leonid Oliker, J...
This paper describes our early experiences with a preproduction Cray XMT system that implements a scalable shared memory architecture with hardware support for multithreading. Unl...
Many scientific applications need to stage large volumes of files from one set of machines to another set of machines in a wide-area network. Efficient execution of such data t...
Web server scalability can be greatly enhanced via hybrid data dissemination methods that use both unicast and multicast. Hybrid data dissemination is particularly promising due t...
Wenhui Zhang, Vincenzo Liberatore, Jonathan Beaver...
High performance computing is critical for financial markets where analysts seek to accelerate complex optimizations such as pricing engines to maintain a competitive edge. In th...