Intersecting Ranges

Thursday, May 23, 2024

After Factor 0.99, we have continued to add new features to what is likely to be an upcoming release. One of those that I would like to discuss today was a patch to add set methods to ranges that was contributed by Keldan Chapman. It was a really nice addition that makes our ranges support efficient set operations.

And that brings us to the topic of today’s blog post: intersecting ranges.

Specifically, a discussion that occurred over code style and I thought it would be interesting to see 3 different approaches to writing a reasonably complex intersecting ranges word in Factor.

You can see this in action, where intersect returns a range in the latest development version:

! Factor 0.99
IN: scratchpad 2 40 2 <range>
               1 40 3 <range> intersect .
V{ 4 10 16 22 28 34 40 }

! Factor 0.100 (git)
IN: scratchpad 2 40 2 <range>
               1 40 3 <range> intersect .
T{ range { from 4 } { length 7 } { step 6 } }

Some inspiration came from looking at the Julia programming language and their intersect function.

Local Variables

The first version used local variables and after a smidge of cleanup, looked pretty good:

:: intersect-range ( range1 range2 -- range3 )
    range1 empty? range2 empty? or [ empty-range ] [
        range1 >forward-range< :> ( start1 stop1 step1 )
        range2 >forward-range< :> ( start2 stop2 step2 )
        step1 step2 gcd :> ( x g )
        start1 start2 - g /mod :> ( z y )
        y zero? not [ empty-range ] [
            start1 x z step1 * * - :> b
            step1 step2 lcm :> a
            start1 start2 [ b over - a rem + ] bi@ max :> m
            stop1  stop2  [  dup b - a rem - ] bi@ min :> n
            m n a <range>
        ] if
    ] if ;

Stack Shuffling

Due to a bug with using local definitions in a bootstrap image, we temporarily switched to a version that used stack shuffling. As you can see – and is often the case when you are dealing with a large number of items on the stack – it isn’t particularly readable.

: intersect-range ( range1 range2 -- range3 )
    2dup [ empty? ] either? [ 2drop empty-range ] [
        [ >forward-range< ] bi@ [ -rot ] [ swap ] [ ] tri* [
            [ reach reach - ] 2dip gcd swap [ /mod ] dip swap
        ] 2keep rot zero? not [
            4drop 4drop empty-range
        ] [
            [ reach ] 4dip dupd lcm [ * * - ] dip [
                [ '[ [ _ over - _ rem + ] bi@ max ] 2dip ]
                [ '[ dup _ - _ rem - ] bi@ min ] 2bi
            ] keep <range>
        ] if
    ] if ;

Small Words

Usually when stack shuffling doesn’t help, sometimes extracting out pieces of logic into words and naming them can often result in a simpler or cleaner version. It did turn out a bit easier to read, but is still overly shuffly.

: explode-ranges ( range1 range2 -- start1 start2 stop1 stop2 step1 step2 )
    [ >forward-range< ] bi@ [ -rot ] [ swap ] [ ] tri* ;

: compute-z-y-x ( start1 start2 step1 step2 -- z y x )
    gcd [ - ] 2dip swap [ /mod ] dip ;

: compute-b-a ( start1 x z step1 step2 -- b a )
    dupd lcm [ * * - ] dip ;

: intersected-range ( start1 start2 stop1 stop2 b a -- range )
    [
        [ '[ [ _ over - _ rem + ] bi@ max ] 2dip ]
        [ '[ dup _ - _ rem - ] bi@ min ] 2bi
    ] keep <range> ;

: intersect-range ( range1 range2 -- range3 )
    2dup [ empty? ] either? [ 2drop empty-range ] [
        explode-ranges [
            [ reach reach ] 2dip compute-z-y-x swap
        ] 2keep rot zero? not [
            4drop 4drop empty-range
        ] [
            [ reach ] 4dip compute-b-a intersected-range
        ] if
    ] if ;

Well, likely the first version was the nicest, but it’s an interesting exercise to think about readability, verbosity, and concatenative thinking. Sometimes you need to puzzle through a piece of code to find the one you like the best.