Sciweavers

6897 search results - page 1203 / 1380
» Parallelization of Modular Algorithms
EUROPAR
2001
Springer
15 years 9 months ago
Performance of High-Accuracy PDE Solvers on a Self-Optimizing NUMA Architecture
High-accuracy PDE solvers use multi-dimensional fast Fourier transforms. The FFTs exhibit a static and structured memory access pattern, which results in a large amount of communic...
Sverker Holmgren, Dan Wallin
FCCM
2000
IEEE
15 years 9 months ago
A Reliable LZ Data Compressor on Reconfigurable Coprocessors
Data compression techniques based on the Lempel-Ziv (LZ) algorithm are widely used in a variety of applications, especially in data storage and communications. However, since the LZ a...
Wei-Je Huang, Nirmal R. Saxena, Edward J. McCluske...
ASPLOS
2010
ACM
15 years 9 months ago
Accelerating MATLAB Image Processing Toolbox functions on GPUs
In this paper, we present our effort in developing an open-source GPU (graphics processing unit) code library for the MATLAB Image Processing Toolbox (IPT). We ported a dozen of r...
Jingfei Kong, Martin Dimitrov, Yi Yang, Janaka Liy...
INFOCOM
1999
IEEE
15 years 9 months ago
High Performance IP Routing Table Lookup using CPU Caching
Wire-speed IP (Internet Protocol) routers require very fast routing table lookup for incoming IP packets. The routing table lookup operation is time consuming because the part of ...
Tzi-cker Chiueh, Prashant Pradhan
MICRO
1999
IEEE
15 years 9 months ago
Balance Scheduling: Weighting Branch Tradeoffs in Superblocks
Since there is generally insufficient instruction-level parallelism within a single basic block, higher performance is achieved by speculatively scheduling operations in superbloc...
Alexandre E. Eichenberger, Waleed Meleis