Complexity effective memory access scheduling for many-core accelerator architectures

13 years 11 months ago

Download www.ece.ubc.ca

Modern DRAM systems rely on memory controllers that employ out-of-order scheduling to maximize row access locality and bank-level parallelism, which in turn maximizes DRAM bandwidth. This is especially important in graphics processing unit (GPU) architectures, where the large quantity of parallelism places a heavy demand on the memory system. The logic needed for out-of-order scheduling can be expensive in terms of area, especially when compared to an in-order scheduling approach. In this paper, we propose a complexity-effective solution to DRAM request scheduling which recovers most of the performance loss incurred by a naive in-order first-in first-out (FIFO) DRAM scheduler compared to an aggressive out-of-order DRAM scheduler. We observe that the memory request stream from individual GPU “shader cores” tends to have sufficient row access locality to maximize DRAM efficiency in most applications without significant reordering. However, the interconnection network across which ...
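To make the contrast between in-order and out-of-order scheduling concrete, the following is a minimal sketch and not the scheduler proposed in the paper. It counts row-buffer hits for a strict FIFO policy and for a simple "oldest row hit first" policy over a small lookahead window. The function names, the `window` parameter, the `(bank, row)` request encoding, and the example request stream are all assumptions made for illustration; the abstract does not specify the out-of-order policy or any of these details.

```python
def fifo_row_hits(requests):
    """Count row-buffer hits when requests are served strictly in arrival order.

    `requests` is a list of (bank, row) tuples (a hypothetical encoding).
    """
    open_row = {}  # bank -> row currently held in that bank's row buffer
    hits = 0
    for bank, row in requests:
        if open_row.get(bank) == row:
            hits += 1
        open_row[bank] = row  # serving a request leaves its row open
    return hits


def row_hit_first_hits(requests, window=4):
    """Count row-buffer hits for a simple out-of-order policy (illustrative).

    Among the oldest `window` pending requests, serve the oldest one that
    hits an open row; if none hits, serve the oldest request.
    """
    pending = list(requests)
    open_row = {}
    hits = 0
    while pending:
        candidates = pending[:window]
        # Index of the oldest row hit in the window, or 0 if there is none.
        pick = next(
            (i for i, (b, r) in enumerate(candidates) if open_row.get(b) == r),
            0,
        )
        bank, row = pending.pop(pick)
        if open_row.get(bank) == row:
            hits += 1
        open_row[bank] = row
    return hits


# Made-up input: two sources, each touching a single row of bank 0,
# whose requests arrive at the controller interleaved.
source_a = [(0, 0)] * 4
source_b = [(0, 1)] * 4
interleaved = [req for pair in zip(source_a, source_b) for req in pair]

print("FIFO, single source:      ", fifo_row_hits(source_a))
print("FIFO, interleaved:        ", fifo_row_hits(interleaved))
print("Row-hit-first, interleaved:", row_hit_first_hits(interleaved))
```

On this toy input, each source alone has good row locality under FIFO (3 hits out of 4 requests), the interleaved stream gets no hits under FIFO, and the row-hit-first policy recovers 6 hits out of 8 by reordering within its window. The numbers only illustrate why reordering logic can matter; they say nothing about the results reported in the paper.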

George L. Yuan, Ali Bakhoda, Tor M. Aamodt
