Sciweavers

CF
2005
ACM
13 years 7 months ago
Matrix register file and extended subwords: two techniques for embedded media processors
In this paper we employ two techniques suitable for embedded media processors. The first technique, extended subwords, uses four extra bits for every byte in a media register. Th...
Asadollah Shahbahrami, Ben H. H. Juurlink, Stamati...
PPOPP
2010
ACM
14 years 2 months ago
Data transformations enabling loop vectorization on multithreaded data parallel architectures
Loop vectorization, a key feature exploited to obtain high performance on Single Instruction Multiple Data (SIMD) vector architectures, is significantly hindered by irregular memo...
Byunghyun Jang, Perhaad Mistry, Dana Schaa, Rodrig...