Short vector (SIMD) instructions are useful in signal processing, multimedia, and scientific applications. They offer higher performance, lower energy consumption, and better res...
General-purpose microprocessors augmented with SIMD execution units enhance multimedia applications by exploiting data-level parallelism. However, supporting/overhead-related inst...
We introduce another view of group theory in the field of interconnection networks. With this approach, it is possible to specify application-specific network topologies for permut...
The widespread presence of SIMD devices in today’s microprocessors has made compiler techniques for these devices tremendously important. One of the most important and difficul...
This paper introduces a method to generate efficient vectorized implementations of small stride permutations using only vector load and vector shuffle instructions. These...