Sciweavers

PLDI
2012
ACM
11 years 7 months ago
Parcae: a system for flexible parallel execution
Workload, platform, and available resources constitute a parallel program’s execution environment. Most parallelization efforts statically target an anticipated range of environ...
Arun Raman, Ayal Zaks, Jae W. Lee, David I. August
ARC
2012
Springer
12 years 8 days ago
A High Throughput FPGA-Based Implementation of the Lanczos Method for the Symmetric Extremal Eigenvalue Problem
Iterative numerical algorithms with high memory bandwidth requirements but medium-size data sets (matrix size ∼ a few 100s) are highly appropriate for FPGA acceleration. This pap...
Abid Rafique, Nachiket Kapre, George A. Constantin...
CGF
2011
12 years 11 months ago
A Parallel SPH Implementation on Multi-Core CPUs
This paper presents a parallel framework for simulating fluids with the Smoothed Particle Hydrodynamics (SPH) method. For low computational costs per simulation step, efficient ...
Markus Ihmsen, Nadir Akinci, Markus Becker, Matthi...
SIGMETRICS
2011
ACM
12 years 11 months ago
Performance analysis of the OP2 framework on many-core architectures
We present a performance analysis and benchmarking study of the OP2 “active” library, which provides an abstraction framework for the solution of parallel unstructured mesh applicatio...
M. B. Giles, Gihan R. Mudalige, Z. Sharif, Graham ...
CSC
2010
13 years 2 months ago
An Evaluation of Parallel Knapsack Algorithms on Multicore Architectures
Emergence of chip multiprocessor systems has dramatically increased the performance potential of computer systems. Since the amount of exploited parallelism is directly influenced ...
Hammad Rashid, Clara Novoa, Apan Qasem
JGO
2010
13 years 3 months ago
Iterative regularization algorithms for constrained image deblurring on graphics processors
The ability of the modern graphics processors to operate on large matrices in parallel can be exploited for solving constrained image deblurring problems in a short time. ...
Valeria Ruggiero, Thomas Serafini, Riccardo Zanell...
ICPPW
2002
IEEE
13 years 9 months ago
Parallel Cholesky Factorization of a Block Tridiagonal Matrix
In this paper we discuss the parallel implementation of the Cholesky factorization of a positive definite symmetric matrix when that matrix is block tridiagonal. While parallel im...
Thuan D. Cao, John F. Hall, Robert A. van de Geijn
SBACPAD
2003
IEEE
13 years 9 months ago
Performance Analysis Issues for Parallel Implementations of Propagation Algorithm
This paper presents a theoretical study to evaluate the performance of a family of parallel implementations of the propagation algorithm. The propagation algorithm is used to an i...
Leonardo Brenner, Luiz Gustavo Fernandes, Paulo Fe...
IPPS
2003
IEEE
13 years 9 months ago
Parallel Direct Solution of Linear Equations on FPGA-Based Machines
The efficient solution of large systems of linear equations represented by sparse matrices appears in many tasks. LU factorization followed by backward and forward substitutions i...
Xiaofang Wang, Sotirios G. Ziavras
IPPS
2003
IEEE
13 years 9 months ago
Performing DNA Comparison on a Bio-Inspired Tissue of FPGAs
String comparison is a critical issue in many application domains, including speech recognition, contents search, and bioinformatics. The similarity between two strings of lengths...
Matteo Canella, Filippo Miglioli, Alessandro Bogli...