We study how several collective operations, such as broadcast, reduction, and scan, can be composed efficiently in complex parallel programs. Our specific contributions are: (1) a fo...
Sergei Gorlatch, Christoph Wedler, Christian Lenga...
This paper describes the design, implementation, and performance of a Java framework for supporting a style of parallel programming in which problems are solved by (recursively) s...
The T-system is a tool for parallel computing developed at the PSI RAS. The most recent implementation is available on both Linux and Windows platforms. The paper is dedicated to one o...
Alexander Moskovsky, Vladimir Roganov, Sergey Abra...
The availability of commodity multiprocessors offers significant opportunities for addressing the increasing computational requirements of optimization applications. To...
We present a set of advanced program parallelization techniques that are able to significantly improve the performance of application programs. We present evidence for this improve...