Sciweavers

321 search results - page 27 / 65
» Implementing Randomized Matrix Algorithms in Parallel and Di...
Sort
View
PPOPP
2010
ACM
15 years 11 months ago
Scaling LAPACK panel operations using parallel cache assignment
In LAPACK many matrix operations are cast as block algorithms which iteratively process a panel using an unblocked algorithm and then update a remainder matrix using the high perf...
Anthony M. Castaldo, R. Clint Whaley
ICPP
1993
IEEE
15 years 6 months ago
Dependence Analysis and Architecture Design for Bit-Level Algorithms
:. In designing application-specific bit-level architectures and in programming existing bit-level processor arrays, it is necessary to expand a word-level algorithm into its bit-...
Weijia Shang, Benjamin W. Wah
LCPC
2007
Springer
15 years 8 months ago
A Novel Asynchronous Software Cache Implementation for the Cell-BE Processor
This paper describes the implementation of a runtime library for asynchronous communication in the Cell BE processor. The runtime library implementation provides with several servi...
Jairo Balart, Marc González, Xavier Martore...
GRID
2006
Springer
15 years 2 months ago
A Decentralized Computational Infrastructure for Grid-Based Parallel Asynchronous Iterative Applications
Abstract Parallel asynchronous iterative algorithms relax synchronization and communication requirements, and can potentially extend Desktop Grids beyond embarrassingly parallel ap...
Zhen Li, Manish Parashar
FMCAD
2000
Springer
15 years 6 months ago
Scalable Distributed On-the-Fly Symbolic Model Checking
Abstract. This paper presents a scalable method for parallel symbolic on-the-fly model checking in a distributed memory environment. Our method combines a scheme for on-the-fly mod...
Shoham Ben-David, Tamir Heyman, Orna Grumberg, Ass...