Re: Factor

Factor: the language, the theory, and the practice.

Argument Parser

Tuesday, May 21, 2024

#command-line

Recently, some discussions on our Factor Discord server reminded me that we were missing an important feature for parsing command-line arguments: an argument parser.

An argument parser turns structured command-line arguments into usable values, and it usually helps with printing command-line usage information as well. I thought it would be useful to go over a few different ways to parse command-line arguments using Factor.

Version 0

In our most recent release, command-line parsing was manual and idiosyncratic. For example, this is how arguments were parsed for the STOMP command-line interface that I built recently:

! Consume the argument list recursively, one flag and its value at a time.
: stomp-options ( args -- )
    [
        unclip >lower {
            ! Each branch drops the matched flag, then takes the next argument as its value.
            { [ dup { "-h" "--host" } member? ] [ drop unclip stomp-host set-global ] }
            { [ dup { "-p" "--port" } member? ] [ drop unclip string>number stomp-port set-global ] }
            { [ dup { "-u" "--username" } member? ] [ drop unclip stomp-username set-global ] }
            { [ dup { "-w" "--password" } member? ] [ drop unclip stomp-password set-global ] }
        } cond stomp-options
    ] unless-empty ;

Version 1

The Factor binary already does some argument parsing, so I extended the command-line vocabulary to support simple option parsing. Specifically, I added a command-line-options word that can be used like so:

: stomp-options ( args -- )
    command-line-options drop
    "host" get "127.0.0.1" or stomp-host set-global
    "port" get [ string>number ] [ 61613 ] if* stomp-port set-global
    "username" get stomp-username set-global
    "password" get stomp-password set-global ;

This reuses our existing parameter parsing code, but it has limitations. Parsed options are stored in the namespace under string keys with string or boolean values, boolean options that default to true are awkward to express, and arguments are neither converted nor validated.

Version 2

I was finally able to build something better that could serve our users. It is modeled after the Python argparse module. Using the newly developed command-line.parser vocabulary, we can now define some option objects to recreate the original example:

{
    T{ option
        { name "--host" }
        { help "set the hostname" }
        { type ipv4 }
        { variable stomp-host }
        { default T{ ipv4 f "127.0.0.1" } }
    }
    T{ option
        { name "--port" }
        { help "set the port" }
        { type integer }
        { variable stomp-port }
        { default 61613 }
    }
    T{ option
        { name "--username" }
        { help "set the username" }
        { variable stomp-username }
    }
    T{ option
        { name "--password" }
        { help "set the password" }
        { variable stomp-password }
    }
} [
    stomp-host get .
    stomp-port get .
    stomp-username get .
    stomp-password get .
] with-options

This now allows for better default values and automatic --help output:

$ ./factor -run=stomp.cli --help
Usage:
    factor -run=stomp.cli [options] [arguments]

Options:
    --help                 show this help and exit
    --host HOST            set the hostname (default: 127.0.0.1)
    --port PORT            set the port (default: 61613)
    --username USERNAME    set the username
    --password PASSWORD    set the password

And some automatic error checking:

$ ./factor -run=stomp.cli --port
ERROR: Expected more arguments for option 'port'

$ ./factor -run=stomp.cli --port asdf
ERROR: Invalid value 'asdf' for option 'port'

$ ./factor -run=stomp.cli --prot
ERROR: Unknown option 'prot'

$ ./factor -run=stomp.cli a b c d
ERROR: Unrecognized arguments: a b c d

$ ./factor -run=stomp.cli -h
ERROR: The option 'h' matches more than one (host, help)
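
The last three errors are consistent with matching an abbreviated option against the known names and accepting it only when exactly one name fits. Here is a minimal Python sketch of that idea. The `resolve_option` function and the `NAMES` list are hypothetical, and treating the matching as unique-prefix matching is an assumption based on the output above, not a description of how command-line.parser is implemented.

```python
# Hypothetical sketch of unique-prefix option matching.
# This is an assumption about the behavior, not the Factor implementation.

NAMES = ["host", "help", "port", "username", "password"]


def resolve_option(arg, names):
    """Resolve a possibly abbreviated option to exactly one known name."""
    key = arg.lstrip("-")
    # An exact match always wins, even if it is a prefix of another name.
    if key in names:
        return key
    matches = [name for name in names if name.startswith(key)]
    if not matches:
        raise ValueError(f"Unknown option '{key}'")
    if len(matches) > 1:
        raise ValueError(
            f"The option '{key}' matches more than one ({', '.join(matches)})"
        )
    return matches[0]


resolved = resolve_option("--po", NAMES)  # a unique prefix of "port"
```

Under this scheme `-h` is rejected because it is a prefix of both `host` and `help`, while `--prot` is rejected because it is a prefix of nothing.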

It also supports fuzzy matching of option names, validation that all required options are provided, constant values that are stored when an option is specified, type coercion and validation of option values, mixing optional and positional arguments, and specifying how many arguments an option expects.
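
For readers who know the Python argparse module that this was modeled after, the sketch below shows roughly how those features look there. It is a comparison only, not the Factor API. The `--verbose` and `--header` options, the `destination` positional, and the `stomp-cli` program name are hypothetical additions for illustration, and making `--username` required is likewise an assumption.

```python
import argparse


def build_parser():
    """Build an argparse parser that mirrors the features listed above."""
    parser = argparse.ArgumentParser(prog="stomp-cli")  # hypothetical name
    # Default value.
    parser.add_argument("--host", default="127.0.0.1", help="set the hostname")
    # Type coercion and validation: non-integers are rejected.
    parser.add_argument("--port", type=int, default=61613, help="set the port")
    # Required option (an assumption made for this sketch).
    parser.add_argument("--username", required=True, help="set the username")
    # Constant value stored when the option is present (hypothetical option).
    parser.add_argument("--verbose", action="store_const", const=True,
                        default=False, help="enable verbose output")
    # Fixed number of expected arguments (hypothetical option).
    parser.add_argument("--header", nargs=2, metavar=("KEY", "VALUE"),
                        help="add a header")
    # Positional argument mixed with the optional ones (hypothetical).
    parser.add_argument("destination", help="destination to use")
    return parser


# "--user" is accepted as a unique abbreviation of "--username".
args = build_parser().parse_args(
    ["--user", "guest", "--port", "61614", "/queue/test"]
)
```

Here `args.port` is the integer 61614 rather than a string, and `args.host` falls back to its default because the option was not given.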

Some things that I would still like to build include support for short option codes, exit codes when argument parsing errors occur, and perhaps support for docopt-style declarations.

This is available in the development version – give it a try!