Sciweavers

» Algorithm Engineering for Parallel Computation
ARC
2008
Springer
15 years 5 months ago
Optimal Unroll Factor for Reconfigurable Architectures
Abstract. Loops are an important source of optimization. In this paper, we address such optimizations for those cases when loops contain kernels mapped on reconfigurable fabric. We...
Ozana Silvia Dragomir, Elena Moscu Panainte, Koen ...
NIPS
2008
15 years 5 months ago
Asynchronous Distributed Learning of Topic Models
Distributed learning is a problem of fundamental interest in machine learning and cognitive science. In this paper, we present asynchronous distributed learning algorithms for two...
Arthur Asuncion, Padhraic Smyth, Max Welling
WOTUG
2008
15 years 5 months ago
Process-Oriented Collective Operations
Abstract. Distributing process-oriented programs across a cluster of machines requires careful attention to the effects of network latency. The MPI standard, widely used for cluste...
John Markus Bjørndalen, Adam T. Sampson
CMPB
2010
15 years 3 months ago
Towards real-time radiation therapy: GPU accelerated superposition/convolution
We demonstrate the use of highly parallel graphics processing units (GPUs) to accelerate the Superposition/Convolution (S/C) algorithm to interactive rates while reducing the numbe...
Robert Jacques, Russell Taylor, John Wong, Todd Mc...
FGCS
2006
15 years 3 months ago
OpenMP versus MPI for PDE solvers based on regular sparse numerical operators
Two parallel programming models represented by OpenMP and MPI are compared for PDE solvers based on regular sparse numerical operators. As a typical representative of such an app...
Markus Nordén, Sverker Holmgren, Michael Th...