Sciweavers

201 search results - page 32 / 41
PPOPP
2009
ACM
16 years 5 days ago
A compiler-directed data prefetching scheme for chip multiprocessors
Data prefetching has been widely used in the past as a technique for hiding memory access latencies. However, data prefetching in multi-threaded applications running on chip multi...
Dhruva Chakrabarti, Mahmut T. Kandemir, Mustafa Ka...
ISHPC
2000
Springer
15 years 3 months ago
Leveraging Transparent Data Distribution in OpenMP via User-Level Dynamic Page Migration
This paper describes transparent mechanisms for emulating some of the data distribution facilities offered by traditional data-parallel programming models, such as High Performance...
Dimitrios S. Nikolopoulos, Theodore S. Papatheodor...
IEEEPACT
2005
IEEE
15 years 5 months ago
An Event-Driven Multithreaded Dynamic Optimization Framework
Dynamic optimization has the potential to adapt the program’s behavior at run-time to deliver performance improvements over static optimization. Dynamic optimization systems usu...
Weifeng Zhang, Brad Calder, Dean M. Tullsen
CASES
2009
ACM
15 years 3 months ago
Exploiting residue number system for power-efficient digital signal processing in embedded processors
2's complement number system imposes a fundamental limitation on the power and performance of arithmetic circuits, due to the fundamental need of cross-datapath carry propaga...
Rooju Chokshi, Krzysztof S. Berezowski, Aviral Shr...
IWOMP
2009
Springer
15 years 6 months ago
Scalability Evaluation of Barrier Algorithms for OpenMP
OpenMP relies heavily on barrier synchronization to coordinate the work of threads that are performing the computations in a parallel region. A good implementation of barriers is ...
Ramachandra C. Nanjegowda, Oscar Hernandez, Barbar...