Multiple clock cycles are needed to cross the global interconnects for multi-gigahertz designs in nanometer technologies. For synchronous designs, this requires retiming and pipel...
To meet the conflicting goals of high-performance low-cost embedded systems, critical application loop nests are commonly executed on specialized hardware accelerators. These loop...
Kevin Fan, Manjunath Kudlur, Hyunchul Park, Scott ...
Architectural decisions for DSP modules are often analyzed using high level C models. Such high-level explorations allow early examination of the algorithms and the architectural ...
In this paper, we propose an efficient technique for run-time application mapping onto Network-on-Chip (NoC) platforms with multiple voltage levels. Our technique consists of a re...
We present the application of hardware accelerated volume rendering algorithms to the simulation of radiographs as an aid to scientists designing experiments, validating simulatio...