ICS
2005
Tsinghua U.

System noise, OS clock ticks, and fine-grained parallel applications

As parallel jobs get bigger in size and finer in granularity, “system noise” is increasingly becoming a problem. In fact, fine-grained jobs on clusters with thousands of SMP nodes run faster if one processor per node is intentionally left idle, thus separating “system noise” from the computation. Paying a cost in average processing speed at a node for the sake of eliminating occasional process delays is (unfortunately) beneficial, as such delays are enormously magnified when one late process holds up thousands of peers with which it synchronizes. We provide a probabilistic argument showing that, under certain conditions, the effect of such noise is linearly proportional to the size of the cluster (as is often empirically observed). We then identify a major source of noise to be the indirect overhead of periodic OS clock interrupts (“ticks”), which are used by all general-purpose OSs as a means of maintaining control. This is shown for various grain sizes, p...
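The linear-scaling claim can be illustrated with a minimal sketch. It is not the paper's model: it assumes, purely for illustration, that each of `n` processes is delayed independently with a small probability `p` during one compute phase, and that a single delayed process stalls the whole phase at the next synchronization point. The names `prob_any_delayed` and `linear_approx` are hypothetical.

```python
def prob_any_delayed(p: float, n: int) -> float:
    """Probability that at least one of n independent processes is delayed.

    Assumption (not from the paper): delays are independent across
    processes, each occurring with probability p in a given phase.
    """
    return 1.0 - (1.0 - p) ** n


def linear_approx(p: float, n: int) -> float:
    """First-order approximation n * p, reasonable only while n * p << 1."""
    return n * p


if __name__ == "__main__":
    p = 1e-6  # illustrative per-process delay probability, chosen arbitrarily
    for n in (1, 10, 100, 1_000, 10_000):
        exact = prob_any_delayed(p, n)
        approx = linear_approx(p, n)
        print(f"n={n:>6}  exact={exact:.3e}  linear={approx:.3e}")
```

Under these assumptions, the chance that a phase is held up grows roughly in proportion to the number of processes as long as `n * p` stays small, which is consistent with the linear effect described above. Once `n * p` approaches 1, the approximation breaks down and the probability saturates.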
Added 27 Jun 2010
Updated 27 Jun 2010
Type Conference
Year 2005
Where ICS
Authors Dan Tsafrir, Yoav Etsion, Dror G. Feitelson, Scott Kirkpatrick