Instruction set customization is an effective way to improve processor performance. Critical portions of application dataflow graphs are collapsed for accelerated execution on s...
Nathan Clark, Jason A. Blome, Michael L. Chu, Scot...
This paper describes a methodology for using the matrix-vector multiply and scan conversion hardware present in many graphics workstations to rapidly approximate the optical flow ...
Dan S. Wallach, Sharma Kunapalli, Michael F. Cohen
In this paper, we present the application of dynamic trellis diagrams (DTDs) to automatic translation of data flow graphs (DFGs) into highly optimized programs for digital signal ...
Recent studies on program execution behavior reveal that a large amount of execution time is spent in small frequently executed regions of code. Whereas adaptive cache management ...
Ravikrishnan Sree, Alex Settle, Ian Bratt, Daniel ...
—While many-core accelerator architectures, such as today’s Graphics Processing Units (GPUs), offer orders of magnitude more raw computing power than contemporary CPUs, their m...
Aaron Ariel, Wilson W. L. Fung, Andrew E. Turner, ...