In this paper we present an efficient algorithm for compile-time scheduling and clustering of parallel programs onto parallel processing systems with distributed memory, which is ...
One of the main obstacles in obtaining high performance from message-passing multicomputer systems is the inevitable communication overhead which is incurred when tasks executin...
Many high-level parallel programming languages allow for fine-grained parallelism. As in the popular work-time framework for parallel algorithm design, programs written in such lan...
Helper locks allow programs with large parallel critical sections, called parallel regions, to execute more efficiently by enlisting processors that might otherwise be waiting on ...
We present an adaptive work-stealing thread scheduler, ASTEAL, for fork-join multithreaded jobs, like those written using the Cilk multithreaded language or the Hood work-stealing...
Kunal Agrawal, Charles E. Leiserson, Yuxiong He, W...