Optimistic Parallelization of Floating-Point Accumulation

Abstract: Floating-point arithmetic is notoriously nonassociative because its limited-precision representation requires intermediate values to be rounded to fit in the available precision. The resulting cyclic dependency in floating-point accumulation inhibits parallelization of the computation, including efficient use of pipelining. In practice, however, we observe that floating-point operations are “mostly” associative. This observation can be exploited to parallelize floating-point accumulation using a form of optimistic concurrency. In this scheme, we first compute an optimistic associative approximation to the sum and then relax the computation by iteratively propagating errors until the correct sum is obtained. We map this computation to a network of 16 statically scheduled, pipelined, double-precision floating-point adders on the Virtex-4 LX160 (-12) device, where each floating-point adder runs at 296 MHz and has a pipeline depth of 10. On this 16-PE design, we dem...

Nachiket Kapre, André DeHon
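
The idea in the abstract can be illustrated in software with a minimal Python sketch. It is not the authors' hardware design: the function names (`sequential_prefix`, `optimistic_prefix`, `relax`) are hypothetical, the optimistic guess is a Hillis-Steele style prefix sum chosen for illustration, and the correction step is a plain fixed-point relaxation on the recurrence p[i] = p[i-1] + x[i] rather than the error-propagation network the abstract describes. Inputs are assumed to be finite doubles with no NaN or overflow.

```python
import random
from itertools import accumulate


def sequential_prefix(xs):
    """Reference result: left-to-right running sums, one rounding per step."""
    return list(accumulate(xs))


def optimistic_prefix(xs):
    """Optimistic guess: prefix sums with a tree-like association order
    (Hillis-Steele scan). Each pass could run fully in parallel."""
    p = list(xs)
    d = 1
    while d < len(p):
        p = [p[i] + p[i - d] if i >= d else p[i] for i in range(len(p))]
        d *= 2
    return p


def relax(xs, guess):
    """Repair a guessed prefix until it satisfies p[i] == p[i-1] + x[i]
    exactly, i.e. until it matches the sequential result bit for bit.
    Returns (prefix, number_of_correction_rounds)."""
    if not xs:
        return [], 0
    p = list(guess)
    # After k corrections the first k entries are exact, so at most
    # len(xs) corrections are ever needed.
    for rounds in range(len(xs) + 1):
        # Every element of this check is independent, so it parallelizes.
        induction = [xs[0]] + [p[i - 1] + xs[i] for i in range(1, len(xs))]
        if induction == p:
            return p, rounds
        p = induction
    raise RuntimeError("did not converge (NaN in the input?)")


if __name__ == "__main__":
    # Nonassociativity: the two groupings round differently.
    print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))

    random.seed(0)
    xs = [random.uniform(-1e6, 1e6) for _ in range(64)]
    prefix, rounds = relax(xs, optimistic_prefix(xs))
    print(prefix == sequential_prefix(xs), rounds)
```

The optimistic guess costs about log2(n) parallel passes, and each correction round is one more parallel pass. In this simplified relaxation a correction can ripple forward only one position per round, so the worst case is no better than the sequential loop; the payoff depends on the guess being right, or nearly right, most of the time.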
