Ring interconnects may be an attractive solution for future chip multiprocessors because they can enable faster links than buses and simpler switches than arbitrary switched inter...
This paper introduces a cost-effective technique to deal with CMP coherence protocol requirements from the interconnection network point of view. A mechanism is presented to avoid...
Both commercial and scientific workloads benefit from concurrency and exhibit data sharing across threads/processes. The resulting sharing patterns are often fine-grain, with t...
Hemayet Hossain, Sandhya Dwarkadas, Michael C. Hua...
We studied the dynamic instruction count reduction for a single-thread, vectorized and a multi-threaded, non-vectorized, MPEG-4 video encoder. Results indicate a maximum improveme...
Realizing scalable cache coherence in the many-core era comes with a whole new set of constraints and opportunities. It is widely believed that multi-hop, unordered on-chip networ...