Re: Factor

Factor: the language, the theory, and the practice.

Crontab

Wednesday, January 31, 2024

#parsing #time

Cron might be the latest, greatest, and coolest “next-generation calendar”, and is now a product called Notion Calendar. But in the good ol’ days, cron was known as:

The cron command-line utility is a job scheduler on Unix-like operating systems. Users who set up and maintain software environments use cron to schedule jobs (commands or shell scripts), also known as cron jobs, to run periodically at fixed times, dates, or intervals. It typically automates system maintenance or administration—though its general-purpose nature makes it useful for things like downloading files from the Internet and downloading email at regular intervals.

There are implementations of crond – the cron daemon – on most operating systems. Many of them have standardized on a crontab format that looks something like this:

# ┌───────────── minute (0–59)
# │ ┌───────────── hour (0–23)
# │ │ ┌───────────── day of the month (1–31)
# │ │ │ ┌───────────── month (1–12)
# │ │ │ │ ┌───────────── day of the week (0–6) (Sunday to Saturday;
# │ │ │ │ │                                   7 is also Sunday on some systems)
# │ │ │ │ │
# │ │ │ │ │
# * * * * * <command to execute>

At first (and sometimes second and third and fourth) glance, this looks a bit inscrutable, and so websites such as crontab guru pop up to help you unpack and explain when a cronentry is expected to be run.

I thought it would be fun to build a parser for these cronentries in Factor.

Let’s start by defining a cronentry type:

TUPLE: cronentry minutes hours days months days-of-week command ;

For each component, there is a variety of allowed inputs:

  • all values in the range: *
  • list of values: 3,5,7
  • range of values: 10-15
  • step values: 1-20/5
  • random value in range: 10~30

We build a parse-value word that will take an input string, a quot to parse the input, and a seq of possible values, as well as a parse-range word to help with optional starting and ending input values.

:: parse-range ( from/f to/f quot: ( input -- value ) seq -- from to )
    from/f [ seq first ] quot if-empty
    to/f [ seq last ] quot if-empty ; inline
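To see why if-empty suits the optional endpoints, here is a small listener sketch (not part of the crontab vocabulary). When one side of the separator is missing, split1 leaves an empty string there, and if-empty then takes the fallback branch. The fallbacks 0 and 59 are assumptions standing in for seq first and seq last on a minutes range.

```factor
USING: kernel math.parser prettyprint sequences splitting ;

! "-15" has no starting value, so the first half is empty.
"-15" "-" split1
[ [ 0 ] [ string>number ] if-empty . ]    ! prints 0 (fallback)
[ [ 59 ] [ string>number ] if-empty . ]   ! prints 15 (parsed)
bi*

! "10-" has no ending value, so the second half is empty.
"10-" "-" split1
[ [ 0 ] [ string>number ] if-empty . ]    ! prints 10 (parsed)
[ [ 59 ] [ string>number ] if-empty . ]   ! prints 59 (fallback)
bi*
```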

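:: parse-value ( input quot: ( input -- value ) seq -- value )
    input {
        ! "*" selects every possible value
        { [ dup "*" = ] [ drop seq ] }

        ! "a,b,c" parses each element and merges the results
        { [ CHAR: , over member? ] [
            "," split [ quot seq parse-value ] map concat ] }

        ! "x/n" keeps every nth value of x; a single start value
        ! steps from its position to the end of seq
        { [ CHAR: / over member? ] [
            "/" split1 [
                quot seq parse-value dup length 1 =
                [ seq swap first seq index seq length ]
                [ 0 over length ] if 1 -
            ] dip string>number <range> swap nths ] }

        ! "a-b" is an inclusive range
        { [ CHAR: - over member? ] [
            "-" split1 quot seq parse-range [a..b] ] }

        ! "a~b" picks one random value from the inclusive range
        { [ CHAR: ~ over member? ] [
            "~" split1 quot seq parse-range [a..b] random 1array ] }

        ! anything else is a single value
        [ quot call 1array ]
    } cond members sort ; inline recursive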
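The step case is the densest part, so here is a standalone sketch of just the index arithmetic, pulled out of parse-value for illustration. It assumes a recent Factor where the range words live in the ranges vocabulary (older releases call it math.ranges).

```factor
USING: arrays kernel math prettyprint ranges sequences ;

! "20-30/5": start from the candidates 20..30.
20 30 [a..b] >array
! Build the indices 0, 5, 10 (every fifth position).
0 over length 1 - 5 <range>
! Keep the candidates at those indices.
swap nths .
! prints { 20 25 30 }

! "5/15" on minutes: a single start value steps from its
! position in the full 0..59 range to the end of that range.
T{ range f 0 60 1 } >array
5 over index over length 1 - 15 <range>
swap nths .
! prints { 5 20 35 50 }
```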

We can then make parse-cronentry to parse the entry description, handling days and months differently to allow their abbreviations to be passed as input (e.g., sun for Sunday or jan for January).

: parse-day ( str -- n )
    [ string>number dup 7 = [ drop 0 ] when ] [
        >lower $[ day-abbreviations3 [ >lower ] map ] index
    ] ?unless ;

: parse-month ( str -- n )
    [ string>number ] [
        ! month-abbreviations is zero-indexed, but cron months are 1-12
        >lower $[ month-abbreviations [ >lower ] map ] index
        dup [ 1 + ] when
    ] ?unless ;
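The abbreviation lookup boils down to an index into the constants from calendar.english. A quick sketch of that lookup on its own, using >lower from the ascii vocabulary for simplicity (an assumption, the code above may resolve >lower from a different vocabulary):

```factor
USING: ascii calendar.english kernel math prettyprint sequences ;

! Day abbreviations start at "Sun", so the index is already
! the cron day-of-week value.
"wed" day-abbreviations3 [ >lower ] map index .        ! prints 3

! Month abbreviations start at "Jan" (index 0), while cron
! counts January as 1, so the index needs one added.
"feb" month-abbreviations [ >lower ] map index 1 + .   ! prints 2
```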

: parse-cronentry ( entry -- cronentry )
    " " split1 " " split1 " " split1 " " split1 " " split1 {
        [ [ string>number ] T{ range f 0 60 1 } parse-value ]
        [ [ string>number ] T{ range f 0 24 1 } parse-value ]
        [ [ string>number ] T{ range f 1 31 1 } parse-value ]
        [ [ parse-month ] T{ range f 1 12 1 } parse-value ]
        [ [ parse-day ] T{ circular f T{ range f 0 7 1 } 1 } parse-value ]
        [ ]
    } spread cronentry boa ;
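The range literals can look off by one at first, because the second slot of a range tuple is a length rather than an end value. A short check in the listener, assuming the ranges vocabulary of a recent Factor:

```factor
USING: kernel prettyprint ranges sequences ;

! Slots are from, length, step: 60 values starting at 0.
T{ range f 0 60 1 } [ first . ] [ last . ] [ length . ] tri
! prints 0, then 59, then 60

! 31 values starting at 1 covers days 1 through 31.
T{ range f 1 31 1 } [ first . ] [ last . ] bi
! prints 1, then 31
```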

We can try using it to see what a parsed cronentry looks like:

IN: scratchpad "20-30/5 5 */5 * * /path/to/command" parse-cronentry .
T{ cronentry
    { minutes { 20 25 30 } }
    { hours { 5 } }
    { days { 1 6 11 16 21 26 31 } }
    { months { 1 2 3 4 5 6 7 8 9 10 11 12 } }
    { days-of-week { 0 1 2 3 4 5 6 } }
    { command "/path/to/command" }
}

Now that we have that working, we can use it to calculate the next-time-after a given timestamp that the cronentry will trigger, applying a waterfall (month, then day, then hour, then minute) to roll over the timestamp until a valid one is found:

:: (next-time-after) ( cronentry timestamp -- )

    f ! should we keep searching for a matching time

    timestamp month>> :> month
    cronentry months>> [ month >= ] find nip
    dup month = [ drop ] [
        [ cronentry months>> first timestamp 1 +year drop ] unless*
        timestamp 1 >>day 0 >>hour 0 >>minute month<< drop t
    ] if

    timestamp day-of-week :> weekday
    cronentry days-of-week>> [ weekday >= ] find nip [
        cronentry days-of-week>> first 7 +
    ] unless* weekday - :> days-to-weekday

    timestamp day>> :> day
    cronentry days>> [ day >= ] find nip [
        cronentry days>> first timestamp days-in-month +
    ] unless* day - :> days-to-day

    cronentry days-of-week>> length 7 =
    cronentry days>> length 31 = 2array
    {
        { { f t } [ days-to-weekday ] }
        { { t f } [ days-to-day ] }
        [ drop days-to-weekday days-to-day min ]
    } case [
        timestamp 0 >>hour 0 >>minute swap +day 2drop t
    ] unless-zero

    timestamp hour>> :> hour
    cronentry hours>> [ hour >= ] find nip
    dup hour = [ drop ] [
        [ cronentry hours>> first timestamp 1 +day drop ] unless*
        timestamp 0 >>minute hour<< drop t
    ] if

    timestamp minute>> :> minute
    cronentry minutes>> [ minute >= ] find nip
    dup minute = [ drop ] [
        [ cronentry minutes>> first timestamp 1 +hour drop ] unless*
        timestamp minute<< drop t
    ] if

    [ cronentry timestamp (next-time-after) ] when ;

: next-time-after ( cronentry timestamp -- timestamp )
    [ dup cronentry? [ parse-cronentry ] unless ]
    [ 1 minutes time+ 0 >>second ] bi*
    [ (next-time-after) ] keep ;
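Each stage of the waterfall repeats one small pattern: find the first allowed value at or after the current one, and if there is none, wrap to the first allowed value and carry into the next larger unit. Here is that pattern in isolation as a sketch. The words next-allowed and next-or-wrap are hypothetical helpers for illustration, not part of the crontab vocabulary.

```factor
USING: arrays kernel math prettyprint sequences ;

! Hypothetical helper: first allowed value >= n, or f if none is left.
: next-allowed ( n allowed -- m/f ) [ <= ] with find nip ;

! Hypothetical helper: as above, but wrap to the first allowed
! value and report that a carry is needed.
: next-or-wrap ( n allowed -- m wrapped? )
    [ next-allowed ] keep over [ drop f ] [ nip first t ] if ;

26 { 20 25 30 } next-allowed .           ! prints 30
31 { 20 25 30 } next-allowed .           ! prints f
0 { 0 30 } next-allowed .                ! prints 0

26 { 20 25 30 } next-or-wrap 2array .    ! prints { 30 f }
31 { 20 25 30 } next-or-wrap 2array .    ! prints { 20 t }
```

Since only f is false in Factor, a found value of 0 (a valid minute or hour) is not mistaken for "nothing found".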

This is great, because we can find the next time that a cronentry will trigger. For example, if we wanted to specify something to trigger at midnight on every leap day:

IN: scratchpad "0 0 29 2 *" now next-time-after timestamp>rfc822 .
"Thu, 29 Feb 2024 00:00:00 -0800"

Or even, the next several times that the cronentry will trigger:

IN: scratchpad "0 0 29 2 *" now 5 [
                   dupd next-time-after [ timestamp>rfc822 . ] keep
               ] times 2drop
"Thu, 29 Feb 2024 00:00:00 -0800"
"Tue, 29 Feb 2028 00:00:00 -0800"
"Sun, 29 Feb 2032 00:00:00 -0800"
"Fri, 29 Feb 2036 00:00:00 -0800"
"Wed, 29 Feb 2040 00:00:00 -0800"

This is available in the crontab vocabulary, along with features such as support for aliases (e.g., @daily and @weekly) and some higher-level words for working with crontabs and cronentries.