The efforts of an expert to parallelize and optimize a dense linear algebra algorithm for distributed-memory targets are largely mechanical and repetitive. We demonstrate that the...
Bryan Marker, Andy Terrel, Jack Poulson, Don S. Ba...
Abstract. In this article, we propose new parallel algorithms for the construction and 2:1 balance refinement of large linear octrees on distributed-memory machines. Such octrees a...
We previously introduced the massively parallel global cellular automata (GCA) model. Parallel algorithms derived from applications can be mapped straightforwardly onto this model. In thi...
Adapting to the network is the key to achieving high performance for communication-intensive applications, including scientific computing, data-intensive computing, and multicast...