Sciweavers

HPCA
2011
IEEE

Shared last-level TLBs for chip multiprocessors

12 years 8 months ago
Shared last-level TLBs for chip multiprocessors
Translation Lookaside Buffers (TLBs) are critical to processor performance. Much past research has addressed uniprocessor TLBs, lowering access times and miss rates. However, as chip multiprocessors (CMPs) become ubiquitous, TLB design must be re-evaluated. This paper is the first to propose and evaluate shared last-level (SLL) TLBs as an alternative to the commercial norm of private, per-core L2 TLBs. SLL TLBs eliminate 7-79% of system-wide misses for parallel workloads. This is an average of 27% better than conventional private, per-core L2 TLBs, translating to notable runtime gains. SLL TLBs also provide benefits comparable to recently-proposed Inter-Core Cooperative (ICC) TLB prefetchers, but with considerably simpler hardware. Furthermore, unlike these prefetchers, SLL TLBs can aid sequential applications, eliminating 35-95% of the TLB misses for various multiprogrammed combinations of sequential applications. This corresponds to a 21% average increase in TLB miss eliminations ...
Abhishek Bhattacharjee, Daniel Lustig, Margaret Ma
Added 20 Aug 2011
Updated 20 Aug 2011
Type Journal
Year 2011
Where HPCA
Authors Abhishek Bhattacharjee, Daniel Lustig, Margaret Martonosi
Comments (0)