This paper presents a tool, the Performance Model Manager, which addresses the complexity of the construction and management of a set of Functional Performance Models on a comp...
We present a novel design and implementation of relational join algorithms for new-generation graphics processing units (GPUs). The most recent GPU features include support for wr...
Bingsheng He, Ke Yang, Rui Fang, Mian Lu, Naga K. ...
Remote atomic memory operations are critical for achieving high-performance synchronization in tightly-coupled systems. Previous approaches to implementing atomic memory operati...
Keith D. Underwood, Michael Levenhagen, K. Scott H...
There is a strong need now for compilers of embedded systems to find effective ways of optimizing series of loop-nests, wherein the majority of the memory references occur in the fo...
Javed Absar, Min Li, Praveen Raghavan, Andy Lambre...
This paper presents a high-performance Distributed Shared Memory system called VODCA, which supports a novel View-Oriented Parallel Programming on cluster computers. One advantage...
Zhiyi Huang, Wenguang Chen, Martin K. Purvis, Weim...