Parallelizing compiler technology has improved in recent years. One area in which compilers have made progress is in handling DOACROSS loops, where cross-processor data dependencie...
Current integration trends embrace the prosperity of single-chip multi-core processors. Although multi-core processors deliver significantly improved system throughput, single-thr...
Multicore machines are becoming common. There are many languages, language extensions and libraries devoted to improving the programmability and performance of these machines. In ...
Diego Andrade, Basilio B. Fraguela, James C. Brodm...
In this paper we explore the impact of the block shape on blocked and vectorized versions of the Sparse Matrix-Vector Multiplication (SpMV) kernel and build upon previous work by ...
Vasileios Karakasis, Georgios I. Goumas, Nectarios...
In this contribution we introduce a low-complexity bit-parallel algorithm for computing square roots over binary extension fields. Our proposed method can be applied to any type ...