Fast hardware turnover in supercomputing centers, stimulated by rapid technological progress, results in high heterogeneity among HPC platforms, and necessitates that applications...
High-performance input-queued switches require high-speed scheduling algorithms while maintaining good performance. Various round-robin scheduling algorithms for Virtual Output Que...
Jing Liu, Chun Kit Hung, Mounir Hamdi, Chi-Ying Ts...
While a number of User-Level Protocols have been developed to reduce the gap between the performance capabilities of the physical network and the performance actually available, a...
Pavan Balaji, Piyush Shivam, Pete Wyckoff, Dhabale...
Modern microprocessors can achieve high performance on linear algebra kernels but this currently requires extensive machine-specific hand tuning. We have developed a methodology wh...
Jeff Bilmes, Krste Asanovic, Chee-Whye Chin, James...
Several recent papers have proposed or analyzed optimal algorithms to route all-to-all personalized communication (AAPC) over communication networks such as meshes, hypercubes and ...