Abstract. Most parallel systems on which MPI is used are now hierarchical: some processors are much closer to others in terms of interconnect performance. One of the most common su...
Hao Zhu, David Goodell, William Gropp, Rajeev Thak...
The constant race for faster and more powerful CPUs is drawing to a close. No longer is it feasible to significantly increase the speed of the CPU without paying a crushing penalt...
Yaron Weinsberg, Danny Dolev, Tal Anker, Muli Ben-...
In recent years, microprocessor manufacturers have shifted their focus from single-core to multi-core processors. To avoid burdening programmers with the responsibility of paralle...
Neil Vachharajani, Ram Rangan, Easwaran Raman, Mat...
1 This paper presents a simple but effective method to reduce on-chip access latency and improve core isolation in CMP Non-Uniform Cache Architectures (NUCA). The paper introduces ...
Javier Merino, Valentin Puente, Pablo Prieto, Jos&...
We study the hardware cost of implementing hash-tree based verification of untrusted external memory by a high performance processor. This verification could enable applications s...
Blaise Gassend, G. Edward Suh, Dwaine E. Clarke, M...