The present contribution explores the design space for virtual channel (VC) and switch allocators in network-on-chip (NoC) routers. Based on detailed RTL-level implementations, we...
Irregular and dynamic parallel applications pose significant challenges to achieving scalable performance on large-scale multicore clusters. These applications often require ongo...
James Dinan, D. Brian Larkins, P. Sadayappan, Srir...
Memory bandwidth limits the performance of important kernels in many scientific applications. Such applications often use sequences of Basic Linear Algebra Subprograms (BLAS), an...
Geoffrey Belter, Elizabeth R. Jessup, Ian Karlin, ...
Existing cache partitioning schemes are designed in a manner oblivious to the implicit processor partitioning enforced by the operating system. This paper examines an operating sy...
Shekhar Srikantaiah, Reetuparna Das, Asit K. Mishr...
Proponents of utility-based scheduling policies have shown the potential for a 100–1400% increase in value-delivered to users when used in lieu of traditional approaches such as...