The graphics processor (GPU) has evolved into an appealing choice for high performance computing due to its superior memory bandwidth, raw processing power, and flexible programm...
Kyle Spafford, Jeremy S. Meredith, Jeffrey S. Vett...
This paper describes a methodology for synthesizing the data-path of a Very Long Instruction Word (VLIW) based Video Signal Processor (VSP). Offering both performance and programm...
SuperMatrix out-of-order scheduling leverages high-level abstractions and straightforward data dependency analysis to provide a general-purpose mechanism for obtaining parallelism from...
Ernie Chan, Field G. Van Zee, Enrique S. Quintana-...
Modern parallel and distributed computing solutions are often built onto a “middleware” software layer providing a higher and common level of service between computat...
Emanuele Di Saverio, Marco Cesati, Christian Di Bi...
In the sub-micron technology era, wire delays are becoming much more important than gate delays, making it particularly attractive to go for clustered designs. A common form of cl...