Sieve of Eratosthenes
Sunday, June 7, 2015
I noticed that the Crystal Programming Language homepage has an example of the Sieve of Eratosthenes algorithm for finding prime numbers. I thought it would be fun to show a few simple variations of that algorithm in Factor.
The algorithm works by iteratively marking as not prime the multiples of each prime, starting with the multiples of 2. In pseudocode, it looks like this:
Input: an integer n > 1
Let A be an array of Boolean values, indexed by integers 2 to n,
initially all set to true.
for i = 2, 3, 4, ..., not exceeding √n:
if A[i] is true:
for j = i2, i2+i, i2+2i, i2+3i, ..., not exceeding n :
A[j] := false
Output: all i such that A[i] is true.
We will be looking at the performance and memory usage to calculate the 664,579 prime numbers between 1 and 10,000,000.
Version 1
Our first version, is a straightforward implementation from the pseudocode:
:: sieve1 ( n -- primes )
n 1 + t <array> :> sieve
f 0 sieve set-nth
f 1 sieve set-nth
2 n sqrt [a..b] [| i |
i sieve nth [
i sq n i <range> [| j |
f j sieve set-nth
] each
] when
] each
sieve [ drop ] selector [ each-index ] dip ;
We can show that it works for the first few prime numbers:
IN: scratchpad 25 sieve1 .
V{ 2 3 5 7 11 13 17 19 23 }
Calculating prime numbers up to 10 million takes less than half a second:
IN: scratchpad [ 10,000,000 sieve1 ] time
Running time: 0.408065579 seconds
The memory used is basically an array with 10 million elements in it, each element of the array is a boolean (64 bits on my laptop). Total memory used is 80 MB.
Version 2
Storing booleans can be more memory-efficient using bit-arrays, where each boolean is stored as a single bit. The logic is the same, however primes are marked by false rather than true.
:: sieve2 ( n -- primes )
n 1 + <bit-array> :> sieve
t 0 sieve set-nth
t 1 sieve set-nth
2 n sqrt [a..b] [| i |
i sieve nth [
i sq n i <range> [| j |
t j sieve set-nth
] each
] unless
] each
sieve [ drop not ] selector [ each-index ] dip ;
Calculating prime numbers up to 10 million is faster:
IN: scratchpad [ 10,000,000 sieve2 ] time
Running time: 0.241894524 seconds
The memory used is now one bit per number, so only 10 million bits or 1.25 MB.
Version 3
Since we know that even numbers (except 2
) are not prime, we can only
work with odd numbers and start our search from 3
. For each number
found, we can check only the odd multiples by marking off i2 + k*i
for
only even k
.
:: sieve3 ( n -- sieve )
n dup odd? [ 1 + ] when 2/ <bit-array> :> sieve
t 0 sieve set-nth
3 n sqrt 2 <range> [| i |
i 2/ sieve nth [
i sq n i 2 * <range> [| j |
t j 2/ sieve set-nth
] each
] unless
] each
V{ 2 } clone :> primes
sieve [
swap [ drop ] [ 2 * 1 + primes push ] if
] each-index
primes ;
Note: the convenience of selector is not available because we want to initialize the list of primes with
2
. Instead, we just do it manually.
Calculating prime numbers up to 10 million is much faster!
IN: scratchpad [ 10,000,000 sieve3 ] time
Running time: 0.110387127 seconds
As for memory usage, it is storing one bit for each odd number, so only 5 million bits or 625 KB.