We present experimental evidence that multiple compute-units, compiled from sequential high-level language input programs, can be merged into a reduced number of configurations f...
Abstract. The paradigm shift in processor design from monolithic processors to multicore has renewed interest in programming models that facilitate parallelism. While multicores ar...
Shan Shan Huang, Amir Hormati, David F. Bacon, Rod...
This paper describes a lightweight Field Programmable Gate Array (FPGA) circuit design that supports the simultaneous programming of multiple devices at different locations throug...
In this paper, we describe a set of compiler analyses and an implementation that automatically map a sequential and un-annotated C program into a pipelined implementation, targete...
— In this paper we investigate mapping stream programs (i.e., programs written in a streaming style for streaming architectures such as Imagine and Raw) onto a general-purpose CP...