Sciweavers

5588 search results - page 344 / 1118
» Parallel Implementation of Bags
CLUSTER
2009
IEEE
15 years 11 months ago
Combining I/O operations for multiple array variables in parallel netCDF
Parallel netCDF (PnetCDF) is a popular library used in many scientific applications to store scientific datasets. It provides high-performance parallel I/O while maintaining ...
Kui Gao, Wei-keng Liao, Alok N. Choudhary, Robert ...
IPPS
2009
IEEE
15 years 11 months ago
Singular value decomposition on GPU using CUDA
Linear algebra algorithms are fundamental to many computing applications. Modern GPUs are suited for many general purpose processing tasks and have emerged as inexpensive high per...
Sheetal Lahabar, P. J. Narayanan
IPPS
2010
IEEE
15 years 2 months ago
Executing task graphs using work-stealing
NABBIT is a work-stealing library for execution of task graphs with arbitrary dependencies which is implemented as a library for the multithreaded programming language Ci...
Kunal Agrawal, Charles E. Leiserson, Jim Sukha
ISPDC
2008
IEEE
15 years 11 months ago
Scalable Dense Factorizations for Heterogeneous Computational Clusters
This paper discusses the design and the implementation of the LU factorization routines included in the Heterogeneous ScaLAPACK library, which is built on top of ScaLAPACK. These ...
Ravi Reddy Manumachu, Alexey L. Lastovetsky, Pedro...
DSD
2007
IEEE
15 years 11 months ago
A Serial Logarithmic Number System ALU
Serial arithmetic uses less hardware than parallel arithmetic. Serial floating point (FP) is slower than parallel FP. The Logarithmic Number System (LNS) simplifies operations, ...
Mark G. Arnold, Panagiotis D. Vouzis