This paper proposes a new fine-grained data distribution operation, MPI_Alltoall_specific, that allows an element-wise distribution of data elements to specific target pro...
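To make the idea of element-wise distribution concrete, the following is a minimal single-process sketch in Python that simulates the exchange without any MPI library. The function name `alltoall_specific_sim`, its argument layout, and the order in which received elements are stored (by source rank, then by position in the send buffer) are assumptions made for illustration only; they are not the interface or the semantics defined by the paper.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def alltoall_specific_sim(
    elems: Sequence[Sequence[T]],
    targets: Sequence[Sequence[int]],
) -> List[List[T]]:
    """Simulate an element-wise all-to-all exchange among len(elems) processes.

    elems[p]   : the data elements held by (simulated) process p
    targets[p] : targets[p][i] is the rank that should receive elems[p][i]

    Returns recv, where recv[q] lists the elements delivered to process q.
    Hypothetical helper: ordering by source rank is an assumption.
    """
    nprocs = len(elems)
    if len(targets) != nprocs:
        raise ValueError("need one target list per process")

    recv: List[List[T]] = [[] for _ in range(nprocs)]
    for src in range(nprocs):
        if len(elems[src]) != len(targets[src]):
            raise ValueError(f"process {src}: one target rank per element required")
        for value, dst in zip(elems[src], targets[src]):
            if not 0 <= dst < nprocs:
                raise ValueError(f"process {src}: invalid target rank {dst}")
            # Each element is routed individually, unlike a block-wise
            # all-to-all where whole contiguous blocks go to each rank.
            recv[dst].append(value)
    return recv


if __name__ == "__main__":
    elems = [["a0", "a1", "a2"], ["b0", "b1"], ["c0"]]
    targets = [[2, 0, 2], [1, 0], [1]]
    for rank, received in enumerate(alltoall_specific_sim(elems, targets)):
        print(rank, received)
```

In this sketch every element carries its own target rank, so a process does not have to pre-sort its send buffer into per-destination blocks before the exchange.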
In the past decade, cluster computing platforms have been widely used to support a variety of scientific and commercial applications, many of which are parallel in nature. Howev...
A drawing of a family of cuts of a graph is an augmented drawing of the graph such that every cut in the family is represented by a simple closed curve and vice versa. We show tha...
This paper describes a high performance sampling architecture for inference of latent topic models on a cluster of workstations. Our system is faster than previous work by over an...
Multithreading aims to tolerate latency by overlapping communication with computation. This report explicates the multithreading capabilities of the EM-X distributed-memory multi...
Andrew Sohn, Yuetsu Kodama, Jui Ku, Mitsuhisa Sato...