Re: Factor

Factor: the language, the theory, and the practice.

Shuffle Syntax

Sunday, May 18, 2025

#syntax

Some might describe shuffle words as one of the fundamental building blocks of Factor. Others might describe them as a code smell and seek to use dataflow combinators or other higher-level words to reduce code complexity.

Whatever your opinion is, they are useful concepts in a concatenative language. Besides the basic shuffle words – like dup, swap, rot – we have had the shuffle vocabulary which provides some “additional shuffle words” for awhile, as well as a syntax word that can perform arbitrary shuffles:

IN: scratchpad USE: shuffle

IN: scratchpad { 1 2 3 4 5 } [
                   shuffle( a b c d e -- d e c a b )
               ] with-datastack .
{ 4 5 3 1 2 }

This would be quite useful, except that it has had a fundamental issue – the way it is implemented uses a macro to curry the stack arguments into an array, and then pull the stack arguments back out of the array onto the stack in the requested order.

For example, we can look at a simple swap and a complex shuffle:

IN: scratchpad [ shuffle( x y -- y x ) ] optimized.
[
    2 0 <array> dup >R >R R> 3 set-slot
    R> >R R> dup >R >R R> 2 set-slot
    R> >R R> dup >R >R R> >R R> 3 slot R> >R R> >R R> 2 slot
]

IN: scratchpad [ shuffle( a b c d e -- b a d c e ) ] optimized.
[
    5 0 <array> dup >R >R R> 6 set-slot
    R> >R R> dup >R >R R> 5 set-slot
    R> >R R> dup >R >R R> 4 set-slot
    R> >R R> dup >R >R R> 3 set-slot
    R> >R R> dup >R >R R> 2 set-slot
    R> >R R> dup >R >R R> >R R> 3 slot
    R> dup >R >R R> >R R> 2 slot R> dup >R >R R> >R R> 5 slot
    R> dup >R >R R> >R R> 4 slot R> >R R> >R R> 6 slot
]

And not only would this be less-than-efficient, it would also turn literal arguments that were on the stack into run-time arguments and potentially cause a Cannot apply 'call' to a run-time computed value error if one of the shuffled arguments is a quotation they hope to use.

This bug was described on our issue tracker and I spent some time recently looking into it.

It turns out that we can use the stack checker to indicate that a shuffle is taking place, and use some "special" machinery to allow the optimizing compiler to generate efficient and correct code for these arbitrary shuffles.

After applying a small fix, we can see that the earlier examples are now quite simple:

IN: scratchpad [ shuffle( x y -- y x ) ] optimized.
[ swap ]

IN: scratchpad [ shuffle( a b c d e -- b a d c e ) ] optimized.
[
    ( 10791205 10791206 10791207 10791208 10791209 -- 10791206 10791205 10791208 10791207 10791209 )
]

This is available in the latest development version.