Packet processing systems maintain high throughput despite relatively high memory latencies by exploiting the coarse-grained parallelism available between packets. In particular, ...
We present the architecture and practical VLSI implementation of a 4-Tb/s single-stage switch. It is based on a combined input- and crosspoint-queued structure with virtual output...
We show how the traditional protocol stack, such as TCP/IP, can be eliminated for socket-based high-speed communication within a cluster. The SCI shared memory interconnect is used...
To sustain instruction throughput rates in more aggressively clocked microarchitectures, microarchitects have incorporated larger and more complex branch predictors into their des...
Software routers are becoming an important alternative to proprietary and expensive network devices, because they exploit the economy of scale of the PC market and open...
Andrea Bianco, Jorge M. Finochietto, Giulio Galant...