In this paper we employ two techniques suitable for embedded media processors. The first technique, extended subwords, uses four extra bits for every byte in a media register. Th...
Asadollah Shahbahrami, Ben H. H. Juurlink, Stamati...
Loop vectorization, a key feature exploited to obtain high performance on Single Instruction Multiple Data (SIMD) vector architectures, is significantly hindered by irregular memo...
Byunghyun Jang, Perhaad Mistry, Dana Schaa, Rodrig...