Pipelining has been used in the design of many PRAM algorithms to reduce their asymptotic running time. Paul, Vishkin, and Wagener (PVW) used the approach in a parallel implementat...
We aimed to study the performance of a parallel implementation of an intraoperative nonrigid registration algorithm that accurately simulates the biomechanical properties of the b...
Simon K. Warfield, Matthieu Ferrant, Xavier Gallez...
This article presents the C++ library vShark which reduces the intranode communication overhead of parallel programs on clusters of SMPs. The library is built on top of m...
This paper presents new achievements on the automatic mapping of algorithms, written in imperative software programming languages, to custom computing machines. The reconfigurable ...
Recursive Diagonal Torus (RDT) is a class of interconnection network consisting of recursively overlaid two-dimensional square diagonal tori for massively parallel computers with u...