Sciweavers

118 search results - page 17 / 24
» Communication and Memory Optimal Parallel Data Cube Construc...
Sort
View
LCPC
2007
Springer
15 years 4 months ago
A Novel Asynchronous Software Cache Implementation for the Cell-BE Processor
This paper describes the implementation of a runtime library for asynchronous communication in the Cell BE processor. The runtime library implementation provides with several servi...
Jairo Balart, Marc González, Xavier Martore...
LPNMR
2009
Springer
15 years 4 months ago
Application of ASP for Automatic Synthesis of Flexible Multiprocessor Systems from Parallel Programs
Configurable on chip multiprocessor systems combine advantages of task-level parallelism and the flexibility of field-programmable devices to customize architectures for paralle...
Harold Ishebabi, Philipp Mahr, Christophe Bobda, M...
CORR
2006
Springer
142views Education» more  CORR 2006»
14 years 10 months ago
Decentralized Erasure Codes for Distributed Networked Storage
We consider the problem of constructing an erasure code for storage over a network when the data sources are distributed. Specifically, we assume that there are n storage nodes wit...
Alexandros G. Dimakis, Vinod M. Prabhakaran, Kanna...
ISCA
2011
IEEE
238views Hardware» more  ISCA 2011»
14 years 1 months ago
Rebound: scalable checkpointing for coherent shared memory
As we move to large manycores, the hardware-based global checkpointing schemes that have been proposed for small shared-memory machines do not scale. Scalability barriers include ...
Rishi Agarwal, Pranav Garg, Josep Torrellas
ICS
2010
Tsinghua U.
15 years 2 months ago
Large-scale FFT on GPU clusters
A GPU cluster is a cluster equipped with GPU devices. Excellent acceleration is achievable for computation-intensive tasks (e.g. matrix multiplication and LINPACK) and bandwidth-i...
Yifeng Chen, Xiang Cui, Hong Mei