Sciweavers

26 search results - page 5 / 6
» Using Embedded Network Processors to Implement Global Memory...
ICS
2010
Tsinghua U.
13 years 11 months ago
Large-scale FFT on GPU clusters
A GPU cluster is a cluster equipped with GPU devices. Excellent acceleration is achievable for computation-intensive tasks (e.g. matrix multiplication and LINPACK) and bandwidth-i...
Yifeng Chen, Xiang Cui, Hong Mei
CLUSTER
2008
IEEE
14 years 27 days ago
High message rate, NIC-based atomics: Design and performance considerations
Remote atomic memory operations are critical for achieving high-performance synchronization in tightly-coupled systems. Previous approaches to implementing atomic memory operati...
Keith D. Underwood, Michael Levenhagen, K. Scott H...
IPPS
2000
IEEE
13 years 10 months ago
Thread Migration and Load Balancing in Non-Dedicated Environments
Networks of workstations are fast becoming the standard environment for parallel applications. However, the use of “found” resources as a platform for tightly-coupled runtime ...
Kritchalach Thitikamol, Peter J. Keleher
RTAS
2010
IEEE
13 years 4 months ago
Physicalnet: A Generic Framework for Managing and Programming Across Pervasive Computing Networks
This paper describes the design and implementation of a pervasive computing framework, named Physicalnet. Essentially, Physicalnet is a generic paradigm for managing and programmi...
Pascal Vicaire, Zhiheng Xie, Enamul Hoque, John A....
SC
1995
ACM
13 years 10 months ago
A Performance Evaluation of the Convex SPP-1000 Scalable Shared Memory Parallel Computer
The Convex SPP-1000 is the first commercial implementation of a new generation of scalable shared memory parallel computers with full cache coherence. It employs a hierarchical s...
Thomas L. Sterling, Daniel Savarese, Peter MacNeic...