Hard-to-predict branches depending on longlatency cache-misses have been recognized as a major performance obstacle for modern microprocessors. With the widening speed gap between...
Hongliang Gao, Yi Ma, Martin Dimitrov, Huiyang Zho...
On Chip Multiprocessors (CMP), it is common that multiple cores share certain levels of cache. The sharing increases the contention in cache and memory-to-chip bandwidth, further h...
Yunlian Jiang, Eddy Z. Zhang, Kai Tian, Xipeng She...
Communication latencies within critical sections constitute a major bottleneck in some classes of emerging parallel workloads. In this paper, we argue for the use of Inferentially...
Performance evaluation of modern, highly speculative, out-of-order microprocessors and the corresponding production of detailed, valid, accurate results have become serious challe...
The decision tree is one of the most fundamental ing abstractions. A commonly used type of decision tree is the alphabetic binary tree, which uses (without loss of generality) &quo...