Sciweavers

IPPS
1999
IEEE

Performance Results for a Reliable Low-Latency Cluster Communication Protocol

Existing low-latency protocols make unrealistically strong assumptions about reliability. This allows them to achieve impressive performance, but also prevents this performance from being exploited by applications, which must then deal with reliability issues in the application code. We present results from a new protocol that provides error recovery, and whose performance is close to that of existing low-latency protocols. We achieve a CPU overhead of 1.5 µs for packet download and 3.6 µs for upload. Our results show that (a) executing a protocol in the kernel is not incompatible with high performance, and (b) complete control over the protocol stack enables (1) simple forms of flow control to be adopted, (2) proper bracketing of the unreliable portions of the interconnect, thus minimising buffers held up for possible recovery, and (3) the sharing of buffer pools. The result is a protocol which performs well in the context of parallel computation and the loose coupling of processes in the workstations of a c...
Stephen R. Donaldson, Jonathan M. D. Hill, David B
Added 03 Aug 2010
Updated 03 Aug 2010
Type Conference
Year 1999
Where IPPS
Authors Stephen R. Donaldson, Jonathan M. D. Hill, David B. Skillicorn