Instruction-set extensible processors allow an existing processor core to be extended with application-specific custom instructions. In this paper, we explore a novel application...
Basic retiming is an algorithm originally developed for hardware optimization. Software pipelining is a technique proposed to increase instruction-level parallelism for parallel p...
The memory subsystem accounts for a significant portion of the aggregate energy budget of contemporary embedded systems. Moreover, there exists a large potential for optimizing th...
As more complex DSP algorithms are realized in practice, there is an increasing need for high-level stream abstractions that can be compiled without sacrificing efficiency. Toward this en...
Andrew A. Lamb, William Thies, Saman P. Amarasingh...
Memory systems consume a significant portion of power in handheld embedded systems. So far, low-power memory techniques have addressed the power consumption when the system is tu...