Map- and fold-like skeletons are a suitable abstractions to guide parallel program execution in functional array processing. However, when it comes to achieving high performance, i...
Abstract--Multi-core processors with accelerators are becoming commodity components for high-performance computing at scale. While accelerator-based processors have been studied in...
M. Mustafa Rafique, Ali Raza Butt, Dimitrios S. Ni...
Performance of multithreaded programs is heavily influenced by the latencies of the thread management and synchronization operations. Improving these latencies becomes especially...
Alex Gontmakher, Avi Mendelson, Assaf Schuster, Gr...
The emergence of grid and a new class of data-driven applications is making a new form of parallelism desirable, which we refer to as coarse-grained pipelined parallelism. This pa...
Although chip-multiprocessors have become the industry standard, developing parallel applications that target them remains a daunting task. Non-determinism, inherent in threaded a...
Marek Olszewski, Jason Ansel, Saman P. Amarasinghe