Matrix transpose in parallel systems typically involves costly all-to-all communications. In this paper, we provide a comparative characterization of various efficient algorithms f...
Diverse IP cores are integrated on a modern system-on-chip and share resources. Off-chip memory bandwidth is often the scarcest resource and requires careful allocation. Two of t...
Min Kyu Jeong, Mattan Erez, Chander Sudanthi, Nige...
This paper presents several novel theoretical results regarding the recovery of a low-rank matrix from just a few measurements consisting of linear combinations of the matrix entr...
We propose a technique for reducing the energy spent in the memory-processor interface of an embedded system during the execution of firmware code. The method is based on the idea ...
Luca Benini, Alberto Macii, Enrico Macii, Massimo ...