As communication and I/O traffic increase on the interconnection network of high-performance systems, network contention becomes a critical problem drastically reducing performan...
This work presents and evaluates a novel processor microarchitecture which combines two paradigms: access/ execute decoupling and simultaneous multithreading. We investigate how b...
Advances in network technology continue to improve the communication performance of workstation and PC clusters, making high-performance workstation-clustercomputing increasingly ...
A router design for torus networks that significantly reduces message latency over traditional wormhole routers is presented in this paper. This new router implements virtual cut-...
The goal of this paper is to gain insight into the relative performance of communication mechanisms as bisection bandwidth and network latency vary. We compare shared memory with ...
Frederic T. Chong, Rajeev Barua, Fredrik Dahlgren,...