## Compressed Sieve of Eratosthenes

Wednesday, June 17, 2015

In my previous post, we used “2-3-5-7” wheel factorization to produce a faster Sieve of Eratosthenes with a compression factor of 2 (by storing only the odd numbers). Samuel Tardieu reminded me that the sieve variation used by Factor uses less memory with a higher compression factor of 3.75.

I thought it would be a fun implementation to show below.

### Version 6

The “2-3-5” wheel with the first 30 numbers that are not divisible by 2, 3, or 5.

```
CONSTANT: wheel-2-3-5 $[
30 [1..b] [
{ 2 3 5 } [ divisor? ] with any? not
] B{ } filter-as
]
```

Note: We use 30 here because the “`{ 2 3 5 } product`

” cycle is 30.

If you look at the wheel, you see that there are 8 candidates in every 30 numbers:

```
IN: scratchpad wheel-2-3-5 .
{ 1 7 11 13 17 19 23 29 }
```

The number `8`

is interesting because it suggests that we can use a
single byte to store the 8 candidates in each block of 30 numbers, using
a bitmask for possible
primes or `f`

:

```
CONSTANT: masks $[
30 [0,b) [
wheel-2-3-5 index [ 7 swap - 2^ ] [ f ] if*
] map
]
```

We can index into this array to find the byte/mask for every number:

```
: byte-mask ( n -- byte mask/f )
30 /mod masks nth ;
```

Using the byte/mask we can check if a number is marked in the sieve byte-array.

```
:: marked? ( n sieve -- ? )
n byte-mask :> ( byte mask )
mask [ byte sieve nth mask bitand zero? not ] [ f ] if ;
```

And we can unmark a number in the sieve in a similar fashion:

```
:: unmark ( n sieve -- )
n byte-mask :> ( byte mask )
mask [ byte sieve [ mask bitnot bitand ] change-nth ] when ;
```

Using that, we can unmark multiples of prime numbers:

```
:: unmark-multiples ( i upper sieve -- )
i sq upper i 2 * <range> [ sieve unmark ] each ;
```

That’s all we need to build our sieve, starting from a byte-array initialized with all numbers marked and then progressively unmarking multiples of primes:

```
:: sieve6 ( n -- sieve )
n 30 /i 1 + [ 0xff ] B{ } replicate-as :> sieve
sieve length 30 * 1 - :> upper
3 upper sqrt 2 <range> [| i |
i sieve marked? [
i upper sieve unmark-multiples
] when
] each sieve ;
```

As you can see, storing prime numbers up to 10 million takes only 333 KB!

```
IN: scratchpad 10,000,000 sieve6 length .
333334
```

This approach is used by the math.primes vocabulary to quickly check primality of any number less than 9 million using only 300 KB of memory.

Neat!