Abstract This paper proposes a new fine-grained data distribution operation MPI Alltoall specific that allows an element-wise distribution of data elements to specific target pro...
Because of its excellent bit-error-rate performance, the Low-Density Parity-Check (LDPC) algorithm is gaining increased attention in communication standards and literature. The ne...
In this work we evaluate the implementation of a video encoder based on the 3D Wavelet Transform optimized for HyperThreading technology and SMPs. We design several implementation...
VLIW machines possibly provide the most direct way to exploit instruction level parallelism; however, they cannot be used to emulate current general-purpose instruction set archit...
Parallel processing networks, even full crossbars, that only implement point-to-point and multicast message passing are inefficient for collective communications because multiple ...