Sciweavers

CLUSTER
2003
IEEE
15 years 11 months ago
Improving the Performance of MPI Derived Datatypes by Optimizing Memory-Access Cost
The MPI Standard supports derived datatypes, which allow users to describe noncontiguous memory layout and communicate noncontiguous data with a single communication function. Thi...
Surendra Byna, William D. Gropp, Xian-He Sun, Raje...
ISCA
2000
IEEE
15 years 10 months ago
Vector instruction set support for conditional operations
Vector instruction sets are receiving renewed interest because of their applicability to multimedia. Current multimedia instruction sets use short vectors with SIMD implementation...
James E. Smith, Greg Faanes, Rabin A. Sugumar
PC
2010
15 years 4 months ago
GPU computing with Kaczmarz's and other iterative algorithms for linear systems
The graphics processing unit (GPU) is used to solve large linear systems derived from partial differential equations. The differential equations studied are strongly convection-...
Joseph M. Elble, Nikolaos V. Sahinidis, Panagiotis...
ICS
2009
Tsinghua U.
16 years 1 month ago
Tuned and wildly asynchronous stencil kernels for hybrid CPU/GPU systems
We describe heterogeneous multi-CPU and multi-GPU implementations of Jacobi’s iterative method for the 2-D Poisson equation on a structured grid, in both single- and double-preci...
Sundaresan Venkatasubramanian, Richard W. Vuduc
IPPS
2009
IEEE
16 years 29 days ago
Crash fault detection in celerating environments
Failure detectors are a service that provides (approximate) information about process crashes in a distributed system. The well-known “eventually perfect” failure detector, ◇P...
Srikanth Sastry, Scott M. Pike, Jennifer L. Welch