Programming specialized network processors (NPU) is inherently difficult. Unlike mainstream processors where architectural features such as out-of-order execution and caches hide ...
Parallel programs that modify shared data in a cachecoherent multiprocessor with a write-invalidate coherence protocol create ownership overhead in the form of ownership acquisiti...
Massively multicore processors, like, for example, Graphics Processing Units (GPUs), provide, at a comparable price, a one order of magnitude higher peak performance than traditio...
Abdullah Gharaibeh, Samer Al-Kiswany, Sathish Gopa...
Many current programmable architectures designed to exploit data parallelism require computation to be structured to operate on sequentially accessed vectors or streams of data. A...
Nuwan Jayasena, Mattan Erez, Jung Ho Ahn, William ...
We investigate the implementation of IP look-up for core routers using multiple microengines and a tailored memory hierarchy. The main architectural concerns are limiting the numb...
Jean-Loup Baer, Douglas Low, Patrick Crowley, Neal...