High-accuracy PDE solvers use multi-dimensional fast Fourier transforms. The FFTs exhibit a static and structured memory access pattern, which results in a large amount of communic...
Data compression techniques based on the Lempel-Ziv (LZ) algorithm are widely used in a variety of applications, especially in data storage and communications. However, since the LZ a...
Wei-Je Huang, Nirmal R. Saxena, Edward J. McCluske...
In this paper, we present our effort in developing an open-source GPU (graphics processing unit) code library for the MATLAB Image Processing Toolbox (IPT). We ported a dozen r...
Jingfei Kong, Martin Dimitrov, Yi Yang, Janaka Liy...
Wire-speed IP (Internet Protocol) routers require very fast routing table lookup for incoming IP packets. The routing table lookup operation is time consuming because the part of ...
Since there is generally insufficient instruction-level parallelism within a single basic block, higher performance is achieved by speculatively scheduling operations in superbloc...