The performance of both serial and parallel implementations of matrix multiplication is highly sensitive to memory system behavior. False sharing and cache conflicts cause traditi...
Siddhartha Chatterjee, Alvin R. Lebeck, Praveen K....
Matrix multiplication is an important kernel in linear algebra algorithms, and the performance of both serial and parallel implementations is highly dependent on the memory system...
Siddhartha Chatterjee, Alvin R. Lebeck, Praveen K....
Abstract. A new algorithm is presented for detecting whether a particular computation of an asynchronous distributed system satisfies Poss Φ (read “possibly Φ”), meaning the...
We introduce dag consistency, a relaxed consistency model for distributed shared memory which is suitable for multithreaded programming. We have implemented dag consistency in sof...
Robert D. Blumofe, Matteo Frigo, Christopher F. Jo...
In this paper, an adaptive matrix multiplication algorithm for dynamic heterogeneous environments is developed and evaluated. Unlike the state-of-the-art approaches, where load ba...