good data partitioning scheme is the need of the time. However it is very diflcult to arrive at a good solution as the number of possible dutupartitionsfor a given real lifeprogra...
The problem of writing high performance parallel applications becomes even more challenging when irregular, sparse or adaptive methods are employed. In this paper we introduce com...
Effective use of the memory hierarchy is critical for achieving high performance on embedded systems. We focus on the class of streaming applications, which is increasingly preval...
Janis Sermulins, William Thies, Rodric M. Rabbah, ...
Single-threaded programming is already considered a complicated task. The move to multi-threaded programming only increases the complexity and cost involved in software developmen...
Matthew J. Bridges, Neil Vachharajani, Yun Zhang, ...
Recent innovations in automatic parallelizing compilers are showing impressive speedups on multicore processors using shared memory with asynchronous channels. We have formulated a...