Abstract. This paper presents a study of performance optimization of dense matrix multiplication on the IBM Cyclops-64 (C64) chip architecture. Although much has been published on how t...
Ziang Hu, Juan del Cuvillo, Weirong Zhu, Guang R. ...
The primary focus of this project is to design and implement a parallel framework for an unstructured mesh generator based on the advancing front method (AFM). In particular, we t...
For an effective Internet-based distributed parallel computing platform, the Java-Internet Computing Environment (JICE) is designed and implemented with multithreading and remote meth...
This paper examines the performance benefits of employing multicast communication and application-level multithreading in the Brazos software distributed shared memory (DSM) syste...
Abstract. We address the problem of developing efficient cache coherence protocols implementing distributed shared memory (DSM) using message passing. A serious drawback of traditi...