# Re: Factor

Factor: the language, the theory, and the practice.

## Optimizing 2^x

Wednesday, October 12, 2011

A great little article was posted last year about optimizing 2^x for doubles (by approximation). The author gets 40+% performance improvements in C#. I wondered if we could get similar improvements in Factor.

The basic idea is to use the fact that a number, `ab+c` can be factored into `ab * ac`. If we separate a floating point number into two components: an integer and a fractional part, we can show that:

`2n = 2integer(n) * 2fractional(n)`.

We are going to approximate this value by using a lookup table to compute the fractional part (within a specified precision). For example, to compute within a 0.001 precision, we need 1000 lookup values, essentially performing this calculation:

``````2n = ( 1 << Int(n) ) * Table[ (int) ( Frac(n) * 1000 ) ];
``````

### Implementation

So, we need a word that can split a floating point number into those two values:

``````: float>parts ( x -- float int )
dup >integer [ - ] keep ; inline
``````

Instead of one table with 1000 values, we will copy the original authors decision to use three lookup tables for additional precision. The following code calculates these lookup tables:

``````CONSTANT: BITS1 10
CONSTANT: BITS2 \$[ BITS1 2 * ]
CONSTANT: BITS3 \$[ BITS1 3 * ]

CONSTANT: PRECISION1 \$[ 1 BITS1 shift ]
CONSTANT: PRECISION2 \$[ 1 BITS2 shift ]
CONSTANT: PRECISION3 \$[ 1 BITS3 shift ]

CONSTANT: MASK \$[ PRECISION1 1 - ]

CONSTANT: FRAC1 \$[ 2 PRECISION1 <iota> [ PRECISION1 / ^ ] with map ]
CONSTANT: FRAC2 \$[ 2 PRECISION1 <iota> [ PRECISION2 / ^ ] with map ]
CONSTANT: FRAC3 \$[ 2 PRECISION1 <iota> [ PRECISION3 / ^ ] with map ]
``````

The function `pow2` looks pretty similar to our original mathematical definition:

``````: pow2 ( n -- 2^n )
>float 2^int 2^frac * >float ;
``````

The guts of the implementation is in the `2^int` and `2^frac` words:

``````: 2^int ( n -- 2^int frac )
[ float>parts ] keep 0 >= [ 1 swap shift ] [
over 0 < [ [ 1 + ] [ 1 - ] bi* ] when
1 swap neg shift 1.0 swap /
] if swap ; inline

: 2^frac ( frac -- 2^frac )
PRECISION3 * >fixnum
[ BITS2 neg shift FRAC1 nth-unsafe ]
[ BITS1 neg shift MASK bitand FRAC2 nth-unsafe ]
[ MASK bitand FRAC3 nth-unsafe ] tri * * ; inline
``````

### Testing

Let’s try it and see how well it works for small values:

``````IN: scratchpad 2 1.5 ^ .
2.82842712474619

2.82842712474619
``````

It seem’s to work, how about larger values:

``````IN: scratchpad 2 16.3 ^ .
80684.28027297248

80684.28026255539
``````

The error is clearly detectable, but to test that it really works the way we expect it too, we will need to calculate relative error:

``````: relative-error ( approx value -- relative-error )
[ - abs ] keep / ;
``````

Our test case will generate random values, compute 2x using our approximation and verify the relative error is less than 0.000000001 when compared with the correct result:

``````[ t ] [
10000 [ -20 20 uniform-random-float ] replicate
[ [ pow2 ] [ 2 swap ^ ] bi relative-error ] map
supremum 1e-9 <
] unit-test
``````

And to verify performance, we will benchmark it against the built-in (and more accurate `^` word):

``````: pow2-test ( seq -- new old )
[ [ pow2 drop ] [ each ] benchmark ]
[ 2 swap [ ^ drop ] with [ each ] benchmark ] bi ;
``````

We got almost a 40% performance improvement at the cost of a small loss of precision - not bad!

``````IN: scratchpad 10000 [ -20 20 uniform-random-float ] replicate
pow2-test / >float .
0.6293153754782392
``````

The code for this is on my GitHub.