Sciweavers

CLUSTER
2001
IEEE

Using Multirail Networks in High-Performance Clusters

13 years 8 months ago
Using Multirail Networks in High-Performance Clusters
Using multiple independent networks (also known as rails) is an emerging technique to overcome bandwidth limitations and enhance fault tolerance of current high-performance parallel computers. In this paper we present and analyze various algorithms to allocate multiple communication rails, including static and dynamic allocation schemes. An analytical lower bound on the number of rails required for static rail allocation is shown. We also present an extensive experimental comparison of the behavior of various algorithms in terms of bandwidth and latency. We show that striping messages over multiple rails can substantially reduce network latency, depending on average message size, network load, and allocation scheme. The compared methods include a static rail allocation, a basic round-robin rail allocation, a local-dynamic allocation based on local knowledge, and a dynamic rail allocation that reserves both communication endpoints of a message before sending it. The last method is show...
Salvador Coll, Eitan Frachtenberg, Fabrizio Petrin
Added 23 Aug 2010
Updated 23 Aug 2010
Type Conference
Year 2001
Where CLUSTER
Authors Salvador Coll, Eitan Frachtenberg, Fabrizio Petrini, Adolfy Hoisie, Leonid Gurvits
Comments (0)