Complex shaders must be partitioned into multiple passes to execute on GPUs with limited hardware resources. Automatic partitioning gives rise to an NP-hard scheduling problem tha...
This paper represents a design study of the datapath for a very long instruction word (VLIW) video signal processor (VSP). VLIW architectures provide high parallelism and excellen...
Andrew Wolfe, Jason Fritts, Santanu Dutta, Edil S....
Growing concerns about power have revived interest in in-order pipelines. In-order pipelines sacrifice single-thread performance. Specifically, they do not allow execution to flow...
It is often possible to greatly improve the performance of a hardware system via the use of predictive (speculative) techniques. For example, the performance of out-of-order micro...
Alan Fern, Robert Givan, Babak Falsafi, T. N. Vija...