Fine-grained accelerators have the potential to deliver significant benefits in various platforms for embedded signal processing. Due to the moderate complexity of their targeted o...
Jani Boutellier, Shuvra S. Bhattacharyya, Olli Sil...
We describe and evaluate a new, pipelined algorithm for large, irregular all-gather problems. In the irregular all-gather problem each process in a set of processes contributes in...
A typical hardware development flow starts the verification process concurrently with RTL, but the overall schedule becomes limited by the effort required to complete all the nece...
This paper shows how to software pipeline a loop for minimal register pressure without sacrificing the loop’s minimum execution time. This novel bidirectional slack-scheduling m...