In this paper we describe a GPU parallelization of the 3D finite difference computation using CUDA. Data access redundancy is used as the metric to determine the optimal implement...
Abstract— In this paper, we describe a system for transforming a code given in ANSI C into an equivalent SystemC description. In order to synthesize parallel C codes into hardwar...
Piotr Dziurzanski, W. Bielecki, Konrad Trifunovic,...
Abstract. FAST-EVP is a simulation tool for internal combustion engines running on cluster platforms; it has evolved from the KIVA-3V code base, but has been extensively rewritten ...
Gino Bella, Alfredo Buttari, Alessandro De Maio, F...
Multicast is an important collective operation for parallel programs. Some Network Interface Cards (NICs), such as Myrinet, have programmable processors that can be programmed to ...
Parallel programming models should attempt to satisfy two conflicting goals. On one hand, they should hide architectural details so that algorithm designers can write simple, port...
Brian Grayson, Michael Dahlin, Vijaya Ramachandran