Sciweavers

140 search results - page 28 / 28
» Profiling and mapping of parallel workloads on network proce...
Sort
View
EUROPAR
2010
Springer
13 years 5 months ago
Optimized On-Chip-Pipelined Mergesort on the Cell/B.E
Abstract. Limited bandwidth to off-chip main memory is a performance bottleneck in chip multiprocessors for streaming computations, such as Cell/B.E., and this will become even mor...
Rikard Hultén, Christoph W. Kessler, Jö...
HPCA
2002
IEEE
14 years 5 months ago
CableS: Thread Control and Memory Management Extensions for Shared Virtual Memory Clusters
Clusters of high-end workstations and PCs are currently used in many application domains to perform large-scale computations or as scalable servers for I/O bound tasks. Although c...
Peter Jamieson, Angelos Bilas
ISCA
2002
IEEE
174views Hardware» more  ISCA 2002»
13 years 4 months ago
Efficient Task Partitioning Algorithms for Distributed Shared Memory Systems
In this paper, we consider the tree task graphs which arise from many important programming paradigms such as divide and conquer, branch and bound etc., and the linear task-graphs...
Sibabrata Ray, Hong Jiang
ISLPED
2005
ACM
93views Hardware» more  ISLPED 2005»
13 years 10 months ago
Power-aware code scheduling for clusters of active disks
In this paper, we take the idea of application-level processing on disks to one level further, and focus on an architecture, called Cluster of Active Disks (CAD), where the storag...
Seung Woo Son, Guangyu Chen, Mahmut T. Kandemir
SC
2005
ACM
13 years 10 months ago
Tera-Scalable Algorithms for Variable-Density Elliptic Hydrodynamics with Spectral Accuracy
We describe Miranda, a massively parallel spectral/compact solver for variabledensity incompressible flow, including viscosity and species diffusivity effects. Miranda utilizes...
Andrew W. Cook, William H. Cabot, Peter L. William...