Load balancing and data locality are the two most important factors in the performance of parallel programs on distributed-memory multiprocessors. A good balancing scheme should e...
We present a highly-scalable non-blocking producer-consumer task pool, designed with a special emphasis on lightweight synchronization and data locality. The core building block o...
Elad Gidron, Idit Keidar, Dmitri Perelman, Yonatha...
This paper presents provable work-optimal parallelizations of STL (Standard Template Library) algorithms based on the workstealing technique. Unlike previous approaches where a deq...
This paper presents a new version of the OMPi OpenMP C compiler, enhanced by lightweight runtime support based on user-level multithreading. A large number of threads can be spawne...
Panagiotis E. Hadjidoukas, Vassilios V. Dimakopoul...