Most modern chip multiprocessors (CMPs) feature a shared on-chip cache. For multithreaded applications, the sharing reduces communication latency among co-running threads, but also r...
The widening gap between microprocessor and disk performance has prompted new techniques for providing memory as a service in high-end computing (HEC). Although the processor and...
Blocked-execution multiprocessor architectures continuously run atomic blocks of instructions, also called Chunks. Such architectures can boost both performance and software pr...
Many application-specific architectures provide indirect addressing modes with auto-increment/decrement arithmetic. Since these architectures generally do not feature an indexe...
Set-associative caches are traditionally managed using hardware-based lookup and replacement schemes that have high energy overheads. Ideally, the caching strategy should be tailor...
Rajiv A. Ravindran, Michael L. Chu, Scott A. Mahlk...