It is well recognized that LRU cache-line replacement can be ineffective for applications with large working sets or non-localized memory access patterns. Specifically, in lastle...
Optimizing the common case has been an adage in decades of processor design practices. However, as the system complexity and optimization techniques’ sophistication have increas...
Processors progressively age during their service life due to normal workload activity. Such aging results in gradually slower circuits. Anticipating this fact, designers add timi...
—Transactional memory (TM) is a promising paradigm for helping programmers take advantage of emerging multicore platforms. Though they perform well under low contention, hardware...
Hany E. Ramadan, Christopher J. Rossbach, Emmett W...
Uniprocessor simulators track resource utilization cycle by cycle to estimate performance. Multiprocessor simulators, however, must account for synchronization events that increas...
Benjamin C. Lee, Jamison D. Collins, Hong Wang 000...