: Three fast mutual exclusion algorithms using read-modify-write and atomic read/write registers are presented in a sequence, with an improvement from one to the next. The last alg...
To keep up with a large degree of instruction level parallelism (ILP), the Itanium 2 cache systems use a complex organization scheme: load/store queues, banking and interleaving. ...
William Jalby, Christophe Lemuet, Sid Ahmed Ali To...
Divide and conquer algorithms are a good match for modern parallel machines: they tend to have large amounts of inherent parallelism and they work well with caches and deep memory...
Modern memory systems rely on spatial locality to provide high bandwidth while minimizing memory device power and cost. The trend of increasing the number of cores that share memo...
Min Kyu Jeong, Doe Hyun Yoon, Dam Sunwoo, Mike Sul...
Abstract. Memory accesses contribute sunstantially to aggregate system delays. It is critical for designers to ensure that the memory subsystem is designed efficiently, and much wo...
Su-Shin Ang, George A. Constantinides, Peter Y. K....