Sciweavers

239 search results - page 1 / 48
» A Performance Evaluation of the Convex SPP-1000 Scalable Sha...
Sort
View
SC
1995
ACM
13 years 8 months ago
A Performance Evaluation of the Convex SPP-1000 Scalable Shared Memory Parallel Computer
The Convex SPP-1000 is the first commercial implementation of a new generation of scalable shared memory parallel computers with full cache coherence. It employs a hierarchical s...
Thomas L. Sterling, Daniel Savarese, Peter MacNeic...
ICPP
1996
IEEE
13 years 8 months ago
Scheduling of Wavefront Parallelism on Scalable Shared-memory Multiprocessors
Tiling exploits temporal reuse carried by an outer loop of a loop nest to enhance cache locality. Loop skewing is typically required to make tiling legal. This restricts parallelis...
Naraig Manjikian, Tarek S. Abdelrahman
ICPP
1995
IEEE
13 years 8 months ago
Fusion of Loops for Parallelism and Locality
Loop fusion improves data locality and reduces synchronization in data-parallel applications. However, loop fusion is not always legal. Even when legal, fusion may introduce loop-...
Naraig Manjikian, Tarek S. Abdelrahman
HIPC
2007
Springer
13 years 10 months ago
Direct Coherence: Bringing Together Performance and Scalability in Shared-Memory Multiprocessors
Traditional directory-based cache coherence protocols suffer from long-latency cache misses as a consequence of the indirection introduced by the home node, which must be accessed...
Alberto Ros, Manuel E. Acacio, José M. Garc...
EUROPAR
2005
Springer
13 years 10 months ago
A Novel Lightweight Directory Architecture for Scalable Shared-Memory Multiprocessors
There are two important hurdles that restrict the scalability of directory-based shared-memory multiprocessors: the directory memory overhead and the long L2 miss latencies due to ...
Alberto Ros, Manuel E. Acacio, José M. Garc...