This paper presents an implementation of a high-performance network application layer parser in FPGAs. At the core of the architecture resides a pattern matcher and a parser. The ...
Current superscalar processors, both RISC and CISC, require substantial instruction fetch and decode bandwidth to keep multiple functional units utilized. While CISC instructions ...
The Convex SPP-1000 is the first commercial implementation of a new generation of scalable shared memory parallel computers with full cache coherence. It employs a hierarchical s...
Thomas L. Sterling, Daniel Savarese, Peter MacNeic...
Sparse matrix-vector multiplication forms the heart of iterative linear solvers used widely in scientific computations (e.g., finite element methods). In such solvers, the matrix-v...
Commercial designs are integrating from 10 to 100 embedded functional and storage blocks in a single system on chip (SoC) currently, and the number is likely to increase significa...