Sciweavers

204 search results - page 9 / 41
CASES
2006
ACM
15 years 1 month ago
Reaching fast code faster: using modeling for efficient software thread integration on a VLIW DSP
When integrating software threads together to boost performance on a processor with instruction-level parallel processing support, it is rarely clear which code regions should be ...
Won So, Alexander G. Dean
IWOMP
2009
Springer
15 years 2 months ago
Dynamic Task and Data Placement over NUMA Architectures: An OpenMP Runtime Perspective
Exploiting the full computational power of current hierarchical multiprocessor machines requires a very careful distribution of threads and data among the underlying non-...
François Broquedis, Nathalie Furmento, Bric...
HPCA
2012
IEEE
13 years 5 months ago
Balancing DRAM locality and parallelism in shared memory CMP systems
Modern memory systems rely on spatial locality to provide high bandwidth while minimizing memory device power and cost. The trend of increasing the number of cores that share memo...
Min Kyu Jeong, Doe Hyun Yoon, Dam Sunwoo, Mike Sul...
IJPP
2010
14 years 6 months ago
ForestGOMP: An Efficient OpenMP Environment for NUMA Architectures
Exploiting the full computational power of current hierarchical multiprocessor machines requires a very careful distribution of threads and data among the underlying non-uniform ar...
François Broquedis, Nathalie Furmento, Bric...
SC
2003
ACM
15 years 2 months ago
Traffic-based Load Balance for Scalable Network Emulation
Load balance is critical to achieving scalability for large network emulation studies, which are of compelling interest for emerging Grid, Peer to Peer, and other distributed appl...
Xin Liu, Andrew A. Chien