Data communications between producer instructions and consumer instructions through memory incur extra delays that degrade processor performance. In this paper, we introduce a new...
— One of the critical goals in code optimization for MPSoC architectures is to minimize the number of off-chip memory accesses. This is because such accesses can be extremely cos...
Parallel programs that modify shared data in a cachecoherent multiprocessor with a write-invalidate coherence protocol create ownership overhead in the form of ownership acquisiti...
We investigate the implementation of IP look-up for core routers using multiple microengines and a tailored memory hierarchy. The main architectural concerns are limiting the numb...
Jean-Loup Baer, Douglas Low, Patrick Crowley, Neal...
Post-silicon processor debugging is frequently carried out in a loop consisting of several iterations of the following two key steps: (i) processor execution for some duration, fo...
Anant Vishnoi, Preeti Ranjan Panda, M. Balakrishna...