Transactional memory is an attractive design concept for scalable multiprocessors because it offers efficient lock-free synchronization and greatly simplifies parallel software....
This paper presents an automated performance tuning solution, which partitions a program into a number of tuning sections and finds the best combination of compiler options for e...
Packet forwarding is a memory-intensive application requiring multiple accesses through a trie structure. The efficiency of a cache for this application critically depends on the ...
Modern out-of-order processors with non-blocking caches exploit Memory-Level Parallelism (MLP) by overlapping cache misses in a wide instruction window. The exploitation of MLP, h...
— Understanding the behavior of emerging workloads is important for designing next generation microprocessors. For addressing this issue, computer architects and performance anal...