This paper presents results on a new approach to partitioning a modulo-scheduled loop for distributed execution on parallel clusters of functional units organized as a VLIW machin...
Marcio Merino Fernandes, Josep Llosa, Nigel P. Top...
level of abstraction, compared with the program representation for scalar optimizations. For example, loop unrolling and loop unrolland-jam transformations exploit the large regist...
Rakesh Krishnaiyer, Dattatraya Kulkarni, Daniel M....
The Discrete Wavelet Transform (DWT) forms the core of the JPEG2000 image compression algorithm. Since the JPEG2000 compression application is heavily data-intensive, the overall ...
We introduce a new register file architecture that provides both row-wise and column-wise accesses, thus allowing partitioned instructions to be used in columnwise processing with...
Yoochang Jung, Stefan G. Berg, Donglok Kim, Yongmi...
High-Level Synthesis (HLS) is the field of transforming a high-level programming language, such as C, into a register transfer level(RTL) description of the design. In HLS, Binary...