The implementation of bounded-delay services over integrated services networks relies on admission control mechanisms that in turn use end-to-end delay computation algorithms. For gu...
The Merrimac supercomputer uses stream processors and a high-radix network to achieve high performance at low cost and low power. The stream architecture matches the capabilities o...
Mattan Erez, Jung Ho Ahn, Ankit Garg, William J. D...
This paper investigates communication strategies for interconnecting heterogeneous parallel systems. As the speed of processors and parallel systems keeps increasing over the ye...
We describe three new Jacobi orderings for parallel computation of SVD problems on tree architectures. The first ordering uses the high bandwidth of a perfect binary fat-tree to min...
By managing network resources at compile time, the compiled communication technique greatly improves the communication performance for communication patterns that are known at com...