
CF
2006
ACM

Landing OpenMP on Cyclops-64: An Efficient Mapping of OpenMP to a Many-Core System-on-a-Chip

This paper presents our experience mapping the OpenMP parallel programming model to the IBM Cyclops-64 (C64) architecture. The C64 employs a many-core-on-a-chip design that integrates processing logic (160 thread units), embedded memory (5MB), and communication hardware on the same die. Such a unique architecture presents new opportunities for optimization. Specifically, we consider the following three areas: (1) a memory-aware runtime library that places frequently used data structures in scratchpad memory; (2) a unique spin lock algorithm for shared-memory synchronization based on in-memory atomic instructions and native support for thread-level execution; (3) a fast barrier that directly uses C64 hardware support for collective synchronization. Together, the three optimizations reduce the overhead of OpenMP language constructs by 80%. We believe that such a drastic reduction in the cost of managing parallelism makes OpenMP more amenable to writing parallel programs on the ...
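For readers unfamiliar with item (2), the following is a minimal, generic sketch of a spin lock built on an atomic test-and-set operation, written in portable C11. It is not the algorithm from the paper and does not use C64 in-memory atomic instructions; the type and function names (`spin_lock_t`, `spin_init`, `spin_try_lock`, `spin_lock`, `spin_unlock`) are hypothetical and chosen only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type: a generic test-and-set spin lock.
   This is NOT the C64-specific algorithm described in the paper. */
typedef struct {
    atomic_flag held;
} spin_lock_t;

/* Put the lock into the unlocked state. */
static void spin_init(spin_lock_t *l) {
    atomic_flag_clear(&l->held);
}

/* Try once to take the lock; returns true on success. */
static bool spin_try_lock(spin_lock_t *l) {
    /* test_and_set returns the previous value: false means we got it. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the lock is acquired. */
static void spin_lock(spin_lock_t *l) {
    while (!spin_try_lock(l)) {
        /* spin */
    }
}

/* Release the lock so another thread can acquire it. */
static void spin_unlock(spin_lock_t *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

A thread would call `spin_lock` before entering a critical section and `spin_unlock` on leaving it. A plain test-and-set lock like this one makes every waiting thread hammer the same memory location, which is the kind of cost a hardware-aware design would aim to reduce.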
Juan del Cuvillo, Weirong Zhu, Guang R. Gao
Added 20 Aug 2010
Updated 20 Aug 2010
Type Conference
Year 2006
Where CF