Sciweavers

5640 search results - page 494 / 1128
» Parallelizing the Data Cube
ICASSP
2011
IEEE
14 years 8 months ago
Real-time DVB-S2 LDPC decoding on many-core GPU accelerators
It is well known that LDPC decoding is computationally demanding and one of the hardest signal operations to parallelize. Beyond data dependencies that restrict the decoding of a ...
Gabriel Falcão Paiva Fernandes, Joao Andrad...
ARC
2012
Springer
14 years 22 days ago
A High Throughput FPGA-Based Implementation of the Lanczos Method for the Symmetric Extremal Eigenvalue Problem
Iterative numerical algorithms with high memory bandwidth requirements but medium-size data sets (matrix size ∼ a few 100s) are highly appropriate for FPGA acceleration. This pap...
Abid Rafique, Nachiket Kapre, George A. Constantin...
ICDCN
2012
Springer
14 years 17 days ago
Lifting the Barriers - Reducing Latencies with Transparent Transactional Memory
Synchronization in distributed systems is expensive because, in general, threads must stall to obtain a lock or to operate on volatile data. Transactional memory, on the other hand...
Annette Bieniusa, Thomas Fuhrmann
PDP
2010
IEEE
15 years 10 months ago
Load Balancing Algorithms with Partial Information Management for the DLML Library
Load balancing algorithms are an essential component of parallel computing, reducing the response time of applications. Frequently, balancing algorithms have a centralize...
Juan Santana-Santana, Miguel A. Castro-Garcí...
HPDC
2010
IEEE
15 years 6 months ago
New caching techniques for web search engines
This paper proposes a cache hierarchy that enables Web search engines to efficiently process user queries. The different caches in the hierarchy are used to store pieces of data w...
Mauricio Marín, Veronica Gil Costa, Carlos ...