As the industry moves toward larger-scale chip multiprocessors, the need to parallelize applications grows. High inter-thread communication delays, exacerbated by over-stressed hi...
Ram Rangan, Neil Vachharajani, Adam Stoler, Guilhe...
The time it takes to reconfigure FPGAs can be a significant overhead for reconfigurable computing. In this paper we develop new compression algorithms for FPGA configurations that...
Current practice in the design of application software for high-performance embedded computing systems is characterized by long development times, lack of interoperability with ot...
Conditional branch induced control hazards cause significant performance loss in modern out-of-order superscalar processors. Dynamic branch prediction techniques help alleviate th...
Most commercial and academic floating point libraries for FPGAs provide only a small fraction of all possible floating point units. In contrast, the floating point unit generat...