For bulk synchronous computations that have nondeterministic behaviors, dynamic remapping is an effective approach to ensure parallel efficiency. There are two basic issues in re...
Torus/mesh-based machines have received increasing attention. It is natural to identify the maximum healthy submeshes in a faulty torus/mesh so as to lower potential performance d...
In this paper, we propose a semi-distributed approach for load balancing in large parallel and distributed systems. The proposed scheme is a two-level hierarchical scheme which partiti...
We present an algorithm to multiply univariate polynomials with integer coefficients efficiently using the Number Theoretic Transform (NTT) on Graphics Processing Units (GPUs). The...
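As a rough illustration of the NTT-based polynomial multiplication this abstract refers to, the following is a minimal sequential sketch in Python. It is not the GPU algorithm from the paper, whose details are not given here. The modulus `MOD`, the primitive root `ROOT`, and the function names `ntt` and `poly_multiply` are assumptions chosen for the example, and the result is only exact when the inputs are non-negative and every product coefficient is smaller than `MOD`.

```python
# Illustrative sketch only: a single-modulus, CPU-side NTT multiplication.
# MOD and ROOT are assumed constants, not values taken from the paper.
MOD = 998244353  # prime equal to 119 * 2**23 + 1, so power-of-two roots of unity exist
ROOT = 3         # a primitive root modulo MOD


def ntt(values, invert=False):
    """Return the (inverse) number theoretic transform of a power-of-two-length list."""
    a = list(values)
    n = len(a)

    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]

    # Iterative Cooley-Tukey butterflies over Z/MOD.
    length = 2
    while length <= n:
        w_len = pow(ROOT, (MOD - 1) // length, MOD)
        if invert:
            w_len = pow(w_len, MOD - 2, MOD)  # modular inverse via Fermat
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u = a[k]
                v = a[k + half] * w % MOD
                a[k] = (u + v) % MOD
                a[k + half] = (u - v) % MOD
                w = w * w_len % MOD
        length <<= 1

    if invert:
        n_inv = pow(n, MOD - 2, MOD)
        a = [x * n_inv % MOD for x in a]
    return a


def poly_multiply(p, q):
    """Multiply two coefficient lists (lowest degree first) modulo MOD."""
    result_len = len(p) + len(q) - 1
    n = 1
    while n < result_len:
        n <<= 1
    fp = ntt(p + [0] * (n - len(p)))
    fq = ntt(q + [0] * (n - len(q)))
    pointwise = [x * y % MOD for x, y in zip(fp, fq)]
    return ntt(pointwise, invert=True)[:result_len]
```

For example, `poly_multiply([1, 2, 3], [4, 5])` multiplies 1 + 2x + 3x^2 by 4 + 5x. Handling larger coefficients would typically require several moduli combined with the Chinese Remainder Theorem, which this sketch omits.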
Efficient data movement is an important part of any high-performance I/O system, but it is especially critical for the current and next generation of massively parallel processing ...
Ron Oldfield, Patrick Widener, Arthur B. Maccabe, ...