Data decomposition is probably the most successful method for generating parallel programs. In this paper a general framework is described for the automatic generation of parallel...
Edwin M. R. M. Paalvast, Henk J. Sips, Arjan J. C....
Configurable on chip multiprocessor systems combine advantages of task-level parallelism and the flexibility of field-programmable devices to customize architectures for paralle...
Harold Ishebabi, Philipp Mahr, Christophe Bobda, M...
Data locality is critical to achievinghigh performance on large-scale parallel machines. Non-local data accesses result in communication that can greatly impact performance. Thus ...
One of the most fundamental problems automatic parallelization tools are confronted with is to find an optimal domain decomposition for a given application. For regular domain prob...
The data distribution problem is very complex, because it involves trade-offdecisions between minimizing communication and maximizing parallelism. A common approach towards solving...