Sciweavers

3321 search results - page 512 / 665
» Performance of parallel computations with dynamic processor ...
Sort
View
119
Voted
IPPS
2000
IEEE
15 years 8 months ago
Using Switch Directories to Speed Up Cache-to-Cache Transfers in CC-NUMA Multiprocessors
In this paper, we propose a novel hardware caching technique, called switch directory, to reduce the communication latency in CC-NUMA multiprocessors. The main idea is to implemen...
Ravi R. Iyer, Laxmi N. Bhuyan, Ashwini K. Nanda
HPCA
1998
IEEE
15 years 8 months ago
The Sensitivity of Communication Mechanisms to Bandwidth and Latency
The goal of this paper is to gain insight into the relative performance of communication mechanisms as bisection bandwidth and network latency vary. We compare shared memory with ...
Frederic T. Chong, Rajeev Barua, Fredrik Dahlgren,...
ICS
2009
Tsinghua U.
15 years 1 months ago
Creating artificial global history to improve branch prediction accuracy
Modern processors require highly accurate branch prediction for good performance. As such, a number of branch predictors have been proposed with varying size and complexity. This ...
Leo Porter, Dean M. Tullsen
ICS
2004
Tsinghua U.
15 years 9 months ago
Scaling the issue window with look-ahead latency prediction
In contemporary out-of-order superscalar design, high IPC is mainly achieved by exposing high instruction level parallelism (ILP). Scaling issue window size can certainly provide ...
Yongxiang Liu, Anahita Shayesteh, Gokhan Memik, Gl...
PVM
1998
Springer
15 years 8 months ago
SKaMPI: A Detailed, Accurate MPI Benchmark
Abstract. SKaMPI is a benchmark for MPI implementations. Its purpose is the detailed analysis of the runtime of individual MPI operations and comparison of these for di erent imple...
Ralf Reussner, Peter Sanders, Lutz Prechelt, Matth...