OpenMP relies heavily on barrier synchronization to coordinate the work of threads that are performing the computations in a parallel region. A good implementation of barriers is ...
Ramachandra C. Nanjegowda, Oscar Hernandez, Barbar...
OpenMP is an architecture-independent language for programming in the shared memory model. OpenMP is designed to be simple and in terms of programming abstractions. Unfortunately,...
A reduction is a computation in which a common operation, such as a sum, is to be performed across multiple pieces of data, each supplied by a separate task. We introduce phaser a...
Jun Shirako, David M. Peixotto, Vivek Sarkar, Will...
This paper presents an algorithm and a data structure for scalable dynamic synchronization in fine-grained parallelism. The algorithm supports the full generality of phasers with d...
Stefan Marr, Stijn Verhaegen, Bruno De Fraine, The...
Interconnects based on Networks-on-Chip are an appealing solution to address future microprocessor designs where, very likely, hundreds of cores will be connected on a single chip...