Automatic pattern search in event traces is a powerful method to identify performance problems in parallel applications. We demonstrate that knowledge about the virtual topology, ...
Nikhil Bhatia, Fengguang Song, Felix Wolf, Jack Do...
The Linux cluster considered in this paper, formed from shuttle box XPC nodes with 2 GHz Athlon processors connected by dual Gb Ethernet switches, is relatively easily constructed...
David J. Johnston, Martin Fleury, Michael Lincoln,...
This paper presents hardware and software mechanisms to enable concurrent direct network access (CDNA) by operating systems running within a virtual machine monitor. In a conventi...
Jeffrey Shafer, David Carr, Aravind Menon, Scott R...
Communication in a parallel system frequently involves moving data from the memory of one node to the memory of another; this is the standard communication model employed in message pa...
Data locality is critical to achieving high performance on large-scale parallel machines. Non-local data accesses result in communication that can greatly impact performance. Thus ...