Current on-chip block-centric memory hierarchies exploit access patterns at the fine-grain scale of small blocks. Several recently proposed techniques for coherence traffic reduct...
Register allocation is one of the most important optimizations a compiler performs. Conventional graphcoloring based register allocators are fast and do well on regular, RISC-like...
Aggressive prefetching is very beneficial for memory latency tolerance of many applications. However, it faces significant challenges in multi-core systems. Prefetchers of diff...
We consider a hybrid wireless sensor network with regular and transmit-only sensors. The transmit-only sensors do not have receiver circuit, hence are cheaper and less energy consu...
Abstract. Recent research indicates that prediction-based coherence optimizations offer substantial performance improvements for scientific applications in distributed shared memor...
Stephen Somogyi, Thomas F. Wenisch, Nikolaos Harda...