Tuning a configurable cache subsystem to an application can greatly reduce memory hierarchy energy consumption. Previous tuning methods use a level one configurable cache only, or...
— Architectural resources and program recurrences are the main limitations to the amount of Instruction-Level Parallelism (ILP) exploitable from loops, the most time-consuming pa...
While performance, area, and power constraints have been the driving force in designing current communication-enabled embedded systems, post-fabrication and run-time adaptability ...
With the increased clock frequency of modern, high-performance processors over 500 MHz, in some cases, limiting the power dissipation has become the most stringent design target. ...
Luca Benini, Giovanni De Micheli, Alberto Macii, E...
State-of-the-art integral-equation-based solvers rely on techniques that can perform a matrix-vector multiplication in O(N) complexity. In this work, a fast inverse of linear comp...