HPCA
15 years 4 months ago
1995 IEEE
The throughput of a multiple-pipelined processor suffers due to lack of sufficient instructions to make multiple pipelines busy and due to delays associated with pipeline depende...
HPCA
15 years 4 months ago
1995 IEEE
Reducing communication latency, which is a performance bottleneck in optically interconnected multiprocessor systems, is of prominent importance. A conventional approach for est...
HPCA
15 years 4 months ago
1995 IEEE
Shared memory is an appealing abstraction for parallel programming. It must be implemented with caches in order to perform well, however, and caches require a coherence mechanism t...
HPCA
15 years 4 months ago
1995 IEEE
Information on the behavior of programs is essential for deciding the number and nature of functional units in high performance architectures. In this paper, we present studies on...
HPCA
15 years 4 months ago
1995 IEEE
In this paper we consider several hardware implementations of the general-purpose atomic primitives fetch and Φ, compare and swap, load linked, and store conditional on large-scal...