We present the architecture and practical VLSI implementation of a 4-Tb/s single-stage switch. It is based on a combined input- and crosspoint-queued structure with virtual output...
We present a generic module, called Fast Collect. Fast Collect is an implementation of Single-Writer Multi-Reader (SWMR) Shared-Memory in an asynchronous system in which a process...
We show how the traditional protocol stack, such as TCP/IP, can be eliminated for socket based high speed communication within a cluster. The SCI shared memory interconnect is used...
Stand-alone threading libraries lack sophisticated memory management techniques. In this paper, we present a methodology that allows threading libraries that implement non-preempti...
ABSTRACT. Reconfigurable systolic arrays can be adapted to efficiently resolve a wide spectrum of computational problems; parallelism is naturally explored in systolic arrays and r...