Massively parallel SIMD array architectures are making their way into embedded processors. In these architectures, a number of identical processing elements having small private st...
Anton Lokhmotov, Benedict R. Gaster, Alan Mycroft,...
SLEEF (SIMD Library for Evaluating Elementary Functions) is a library that facilitates programming with SIMD instructions. It implements the trigonometric functions, inverse trigon...
We present a compiler-internal program optimization that uses graph rewriting. This optimization enables the compiler to automatically use rich instructions (such as SIMD instructi...
The latency of broadcast/reduction operations has a significant impact on the performance of SIMD processors. This is especially true for associative programs, which make extensiv...