The use of pipelined floating-point arithmetic cores to create high-performance FPGA-based computational kernels has introduced a new class of problems that do not exist when usi...
Control intensive scalar programs pose a very different challenge to highly pipelined supercomputers than vectorizable numeric applications. Function call/return and branch instru...
This paper presents a hardware-based dynamic optimizer that continuously optimizes an application’s instruction stream. In continuous optimization, dataflow optimizations are p...
Brian Fahs, Todd M. Rafacz, Sanjay J. Patel, Steve...
In this paper we propose a design technique to pipeline cache memories for high bandwidth applications. With the scaling of technology cache access latencies are multiple clock cy...
-- In this paper, a tool to aid pipelined processor instruction set implementation is described. The purpose of the tool is to choose from among design alternatives a design that m...