This paper presents a reconfigurable processor designed to execute user-defined block-matching motion estimation algorithms, and a toolset for the design of such algorithms and ...
Trevor Spiteri, George Vafiadis, Jose Luis Nunez-Y...
— Small gates, such as AND2, XOR2 and MUX2, have been mixed with lookup tables (LUTs) inside the programmable logic block (PLB) to reduce area and power and increase performance ...
Distributed local memories, or scratchpads, have been shown to effectively reduce cost and power consumption of application-specific accelerators while maintaining performance. Th...
Manjunath Kudlur, Kevin Fan, Michael L. Chu, Scott...
Modern chip multiprocessors (CMPs) are designed to exploit both instruction-level parallelism (ILP) within processors and thread-level parallelism (TLP) within and across processo...
Changkyu Kim, Simha Sethumadhavan, M. S. Govindan,...
— Memory-intensive applications present unique challenges to an ASIC designer in terms of the choice of memory organization, memory size requirements, bandwidth and access latenc...