A large and increasing gap exists between processor and memory speeds in scalable cache-coherent multiprocessors. To cope with this situation, programmers and compiler writers mus...
Field Programmable Gate Arrays (FPGAs) have long held the promise of allowing designers to create systems with performance levels close to custom circuits but with a softwarelike ...
Abstract. In Reconfigurable Systems-On-Chip (RSoCs), operating systems can primarily (1) manage the sharing of limited reconfigurable resources, and (2) support communication betwe...
The Message Passing Interface (MPI) is one of the most widely used programming models for parallel computing. However, the amount of memory available to an MPI process is limited ...
James Dinan, Pavan Balaji, Ewing L. Lusk, P. Saday...
The most intuitive memory model for shared-memory multithreaded programming is sequential consistency (SC), but it disallows the use of many compiler and hardware optimizations th...
Daniel Marino, Abhayendra Singh, Todd D. Millstein...