There has been much work recently on improving the locality performance of loop nests in scientific programs through the use of loop as well as data layout optimizations. However,...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
Resource management on accelerator based systems is complicated by the disjoint nature of the main CPU and accelerator, which involves separate memory hierarhcies, different degr...
Filip Blagojevic, Costin Iancu, Katherine A. Yelic...
Abstract. We describe the design and implementation of the Distributed ObjectOriented Threads System (DOTS). This system is a complete redesign of the Distributed Threads System (D...
In this paper we present a multi-GPU parallel volume rendering implemention built using the MapReduce programming model. We give implementation details of the library, including s...
Jeff A. Stuart, Cheng-Kai Chen, Kwan-Liu Ma, John ...
Among the advantages of visual dataflow programming is that it can give the user a sense of location for the data in a computation. This can help novices build a mental picture of...