Software pipelining is a loop optimization that overlaps the execution of several iterations of a loop to expose more instruction-level parallelism. It can result in first-class p...
A coarse-grain multithreaded processor can effectively hide long memory latencies by quickly switching to an alternate task when the active task issues a memory request, improving...
A dynamic binary translation system for a co-designed virtual machine is described and evaluated. The underlying hardware directly executes an accumulator-oriented instruction set...
This paper presents a hardware-based dynamic optimizer that continuously optimizes an application’s instruction stream. In continuous optimization, dataflow optimizations are p...
Brian Fahs, Todd M. Rafacz, Sanjay J. Patel, Steve...
Our program benchmarks and simulations of novel circuits indicate that large-window processors are feasible. Using our redesigned superscalar components, a large-window processor ...
Dana S. Henry, Bradley C. Kuszmaul, Gabriel H. Loh...