Re: Factor

Factor: the language, the theory, and the practice.

The 3n+1 Problem

Thursday, October 25, 2012

#math

Yesterday, a lengthy blog post from the Racket community discussed the “3n+1” problem (also known as the Collatz conjecture):

Consider a positive number n. The cycle length of n is the number of times we repeat the following, until we reach n=1:

  • If n is odd, then n = 3n+1
  • If n is even, then n = n/2

For example, given n=22, we see the following sequence: 22 11 34 17 52 26 13 40 20 10 5 16 8 4 2 1. The cycle length of 22 is, therefore, 16, since it took 16 repetitions to get to 1.

Given a definition of cycle length of a number, here’s the rest of the problem: given any two numbers i and j, compute the maximum cycle length over all numbers between i and j, inclusive.

Solution

Solving this in Factor could start by defining a word to take a number and compute its next value in the cycle. Since this is an internal word, we won’t add the check for the end condition of n=1:

: next-cycle ( x -- y )
    dup odd? [ 3 * 1 + ] [ 2 / ] if ;

Creating the “cycle sequence” for a given number can be done by:

: cycles ( n -- seq )
    [ dup 1 > ] [ [ next-cycle ] keep ] produce swap suffix ;

We could always find the “cycle length” by calling cycles length, but a slightly faster way would not create an intermediate sequence:

: cycle-length ( n -- m )
    1 [ over 1 > ] [ [ next-cycle ] [ 1 + ] bi* ] while nip ;

We can find the maximum cycle (number and it’s cycle length) for any given sequence of numbers:

: max-cycle ( seq -- elt )
    [ dup cycle-length ] { } map>assoc [ second ] supremum-by ;

For convenience, we can make some helper words to find either the number and/or the maximum cycle length:

: max-cycle-value ( seq -- n ) max-cycle first ;

: max-cycle-length ( seq -- m ) max-cycle second ;

Testing

Factor strives to be a concise language, some of which is natural for a concatenative style. You can see this in the simple approach to unit testing:.

{
    { 22 11 34 17 52 26 13 40 20 10 5 16 8 4 2 1 }
} [ 22 cycles ] unit-test

{ 1  } [ 1 cycle-length ] unit-test
{ 2  } [ 2 cycle-length ] unit-test
{ 6  } [ 5 cycle-length ] unit-test
{ 16 } [ 22 cycle-length ] unit-test

{ 20  } [ 1 10 [a..b] max-cycle-length ] unit-test
{ 125 } [ 100 200 [a..b] max-cycle-length ] unit-test
{ 89  } [ 201 210 [a..b] max-cycle-length ] unit-test
{ 174 } [ 900 1000 [a..b] max-cycle-length ] unit-test

Performance

The original post discussed performance. Since many cycles have numbers in common, we could change our approach to a recursive one, with memoization:

MEMO: fast-cycle-length ( n -- m )
    dup 1 > [ next-cycle fast-cycle-length 1 + ] [ drop 1 ] if ;

Indeed, this takes roughly 15-20% of the time that our previous approach took to compute the number of cycles for the numbers 1 to 100,000 with a cold cache.

Main

Like the original solution, we can add a “main” to allow running from the command line:

: run-cycles ( -- )
    [
        [ blank? ] split-when harvest first2
        [ string>number ] bi@ 2dup [a..b] max-cycle-length
        "%s %s %s\n" printf
    ] each-line ;

MAIN: run-cycles

And using the original sample file, produce the same output:

$ cat sample-data.txt 
1 10
100 200
201 210
900 1000

$ factor cycles.factor < sample-data.txt
1 10 20
100 200 125
201 210 89
900 1000 174

The code for this is available on my GitHub.