Recent advances in FPGA technologies allow to configure the RAM-based FPGA devices in a reduced time as an effective support for real-time applications. The physical dimensions of ...
The increasing size and complexity of high-performance applications have motivated a new round of innovation related to configuration, build, and launch of applications for large ...
This work deals with the scheduling problem of a directed acyclic graph with interprocessor communication delays. The objective is to minimize the makespan, taking into account the...
Accurate performance predictions are difficult to achieve for parallel applications executing on production distributed systems. Conventional point-valued performance parameters a...
Vector prefix and reduction are collective communication primitives in which all processors must cooperate. We present two parallel algorithms, the direct algorithm and the split ...