Re: Factor

Factor: the language, the theory, and the practice.

Even More Brainf*ck

Saturday, May 17, 2025

#compiler #ebnf

I was distracted a little by some recent explorations building a Brainfuck interpreter in Factor and had a couple of follow-ups to add to the conversation.

First, I realized my initial quick-and-dirty Brainfuck interpreter didn’t support nested loops. Specifically, the logic for beginning or ending a loop just searched for the nearest [ or ] character without considering nesting. This was fixed today so that will no longer be an issue.

Second, despite the Brainfuck compiler implicitly making an AST (abstract syntax tree) for Brainfuck by virtue of generating a quotation, I thought it would be more fun to build and generate one intentionally.

We can model the Brainfuck commands as operations using the following tuples and singletons:

TUPLE: ptr n ;
TUPLE: mem n ;
SINGLETONS: output input debug ;
TUPLE: loop ops ;

Next, we can build a parser using EBNF to convert the textual commands into our Brainfuck AST:

EBNF: ast-brainfuck [=[

inc-ptr  = (">")+     => [[ length ptr boa ]]
dec-ptr  = ("<")+     => [[ length neg ptr boa ]]
inc-mem  = ("+")+     => [[ length mem boa ]]
dec-mem  = ("-")+     => [[ length neg mem boa ]]
output   = "."        => [[ output ]]
input    = ","        => [[ input ]]
debug    = "#"        => [[ debug ]]
space    = [ \t\n\r]+ => [[ f ]]
unknown  = (.)        => [[ "Invalid input" throw ]]

ops   = inc-ptr|dec-ptr|inc-mem|dec-mem|output|input|debug|space
loop  = "[" {loop|ops}+ "]" => [[ second sift loop boa ]]

code  = (loop|ops|unknown)* => [[ sift ]]

]=]

This is interesting, because now we can more easily analyze a piece of Brainfuck code, such as the Hello, World example that I have been frequently using:

IN: scratchpad "
               ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
               >++.>+.+++++++..+++.>++.<<+++++++++++++++
               .>.+++.------.--------.>+.>.
               " ast-brainfuck .
V{
    T{ mem { n 10 } }
    T{ loop
        { ops
            V{
                T{ ptr { n 1 } }
                T{ mem { n 7 } }
                T{ ptr { n 1 } }
                T{ mem { n 10 } }
                T{ ptr { n 1 } }
                T{ mem { n 3 } }
                T{ ptr { n 1 } }
                T{ mem { n 1 } }
                T{ ptr { n -4 } }
                T{ mem { n -1 } }
            }
        }
    }
    T{ ptr { n 1 } }
    T{ mem { n 2 } }
    output
    T{ ptr { n 1 } }
    T{ mem { n 1 } }
    output
    T{ mem { n 7 } }
    output
    output
    T{ mem { n 3 } }
    output
    T{ ptr { n 1 } }
    T{ mem { n 2 } }
    output
    T{ ptr { n -2 } }
    T{ mem { n 15 } }
    output
    T{ ptr { n 1 } }
    output
    T{ mem { n 3 } }
    output
    T{ mem { n -6 } }
    output
    T{ mem { n -8 } }
    output
    T{ ptr { n 1 } }
    T{ mem { n 1 } }
    output
    T{ ptr { n 1 } }
    output
}

And then we can implement those operations against a brainfuck state object, by deferring to words from our current implementation:

GENERIC: op ( brainfuck op -- brainfuck )
M: ptr op n>> (>) ;
M: mem op n>> (+) ;
M: output op drop (.) ;
M: input op drop (,) ;
M: debug op drop (#) ;
M: loop op [ get-memory zero? ] swap ops>> '[ _ [ op ] each ] until ;

And now this Brainfuck AST represents a hybrid execution model somewhere between the compiled and interpreted versions:

: hybrid-brainfuck ( code -- )
    [ <brainfuck> ] dip ast-brainfuck [ op ] each drop ;

And see that it works:

IN: scratchpad "
               ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
               >++.>+.+++++++..+++.>++.<<+++++++++++++++
               .>.+++.------.--------.>+.>.
               " hybrid-brainfuck
Hello World!

We also gain some potential for building code optimization techniques that operate on an AST as a step before actual compilation or execution – for example, coalescing adjacent increment and decrement operations or some other more complex analysis.

That, however, is likely to remain an exercise for the reader!