This paper presents hardware and software mechanisms to enable concurrent direct network access (CDNA) by operating systems running within a virtual machine monitor. In a conventi...
Jeffrey Shafer, David Carr, Aravind Menon, Scott R...
Most previous 3D IC research focused on “stacking” traditional 2D silicon layers, so the interconnect reduction is limited to interblock delays. In this paper, we propose tech...
Yongxiang Liu, Yuchun Ma, Eren Kursun, Glenn Reinm...
Most well-performing load value predictors are hybrids that combine multiple predictors into one. Such hybrids are often large. To reduce their size and to improve their performan...
Numerous optimizations exist for improving the performance of nondeferred reference-counting (RC) garbage collection. Their designs are ad hoc, intended to exploit different count...
Chip multiprocessors (CMPs) share a large portion of the memory subsystem among multiple cores. Recent proposals have addressed high-performance and fair management of these share...