One way to exploit Thread Level Parallelism (TLP) is to use architectures that implement novel multithreaded execution models, like Scheduled DataFlow (SDF). This latter model pro...
Improving memory performance at the software level is more effective in reducing the rapidly expanding gap between processor and memory performance. Loop transformations (e.g. loop un...
Surendra Byna, Xian-He Sun, William Gropp, Rajeev ...
Grid Portals, based on standard web technologies, are emerging as important and useful user interfaces to computational and data Grids. Grid Portals enable Virtual Organizations, c...
Michael Russell, Gabrielle Allen, Greg Daues, Ian ...
JuxtaView is a cluster-based application for viewing ultra-high-resolution images on scalable tiled displays. In JuxtaView, we present a new parallel computing and distributed mem...
Naveen K. Krishnaprasad, Venkatram Vishwanath, Sha...
Using multi-GPU systems, including GPU clusters, is gaining popularity in scientific computing. However, when using multiple GPUs concurrently, the conventional data parallel GPU...