Re: Factor

Factor: the language, the theory, and the practice.

Kahan Summation

Monday, October 14, 2013

#math

The Kahan summation algorithm is a “compensated summation” that “significantly reduces the numerical error in the total obtained by adding a sequence of finite precision floating point numbers, compared to the obvious approach”.

Matt Adereth wrote a blog post a few days ago demonstrated some advantages of using Kahan, which I show below using Factor.

The Problem

To demonstrate the problem, let’s consider the harmonic series (e.g., 1 + 1/2 + 1/3 + 1/4 + ...) as a series of rational numbers:

: harmonic-ratios ( n -- seq )
    [1..b] [ recip ] map ;

It gives the first n of the harmonic series:

IN: scratchpad 6 harmonic-ratios .
{ 1 1/2 1/3 1/4 1/5 1/6 }

We define a “simple sum” as just adding the numbers in order:

: simple-sum ( seq -- n )
    0 [ + ] reduce ;

The simple sum of the first 10,000 harmonic ratios as a floating point number is:

IN: scratchpad 10,000 harmonic-ratios simple-sum >float .
9.787606036044382

But, if we use floating points instead of ratios to represent the harmonic numbers (e.g., 1.0 + 0.5 + 0.33333333 + 0.25 + ...):

: harmonic-floats ( n -- seq )
    harmonic-ratios [ >float ] map! ;

You can see that an error has been introduced (ending in “48” instead of “82”):

IN: scratchpad 10,000 harmonic-floats simple-sum .
9.787606036044348

If we reverse the sequence and add them from smallest to largest, there is a slightly different error (ending in “86”):

IN: scratchpad 10,000 harmonic-floats reverse simple-sum .
9.787606036044386

The Solution

The pseudocode for the Kahan algorithm can be seen on the Wikipedia page, using a running compensation for lost low-order bits:

function KahanSum(input)
    var sum = 0.0
    var c = 0.0
    for i = 1 to input.length do
        var y = input[i] - c
        var t = sum + y
        c = (t - sum) - y
        sum = t
    return sum

We could translate this directly using local variables to hold state, but instead I tried to make it a bit more concatenative (and possibly harder to read in this case):

: kahan+ ( c sum elt -- c' sum' )
    rot - 2dup + [ -rot [ - ] bi@ ] keep ;

: kahan-sum ( seq -- n )
    [ 0.0 0.0 ] dip [ kahan+ ] each nip ;

You can see both forward and backward errors no longer exist:

IN: scratchpad 10,000 harmonic-floats kahan-sum .
9.787606036044382

IN: scratchpad 10,000 harmonic-floats reverse kahan-sum .
9.787606036044382

Nifty! Thanks, Matt!