Morphological attribute filters have not previously been parallelized mainly because they are both global and nonseparable. We propose a parallel algorithm that achieves efficient ...
Michael H. F. Wilkinson, Hui Gao, Wim H. Hesselink...
In this paper, we describe a generalized approach to deriving a custom data layout in multiple memory banks for array-based computations, to facilitate high-bandwidth parallel mem...
:. In designing application-specific bit-level architectures and in programming existing bit-level processor arrays, it is necessary to expand a word-level algorithm into its bit-...
Several challenges about computational grid exist in integrating, coordinating and managing of resources and scheduling of applications, due to distributed resources at various le...
– Efficient implementations of the Discrete Fourier Transform (DFT) for GPUs provide good performance with large data sizes, but are not competitive with CPU code for small data ...