We present a throughput-driven partitioning algorithm and a throughput-preserving merging algorithm for the high-level physical synthesis of latency-insensitive (LI) sys...
Software pipelining and unfolding are commonly used techniques to increase parallelism for DSP applications. However, these techniques expand the code size of the application sign...
Bin Xiao, Zili Shao, Chantana Chantrapornchai, Edw...
Data parallel programs are sensitive to the distribution of data across processor nodes. We formulate the reduction of inter-node communication as an optimization on a colored gra...
We evaluate the impact of a gigabit network on the implementation of a distributed chemical process optimization application. The optimization problem is formulated as a stochasti...
Consider a parallel program with n processes and a synchronization granularity z. Consider also two multiprocessors: a multiprocessor with q processors and run-time reallocation o...
Lars Lundberg, Kamilla Klonowska, Magnus Broberg, ...