Re: Factor

Factor: the language, the theory, and the practice.

Anubis

Thursday, August 21, 2025

Tavis Ormany wrote a great blog post about the Anubis project asking a very valid sounding question:

Hey… quick question, why are anime catgirls blocking my access to the Linux kernel?

The answer seems to be that it “weighs the soul of your connection using one or more challenges in order to protect upstream resources from scraper bots”. In particular, this project is an attempt to fight the AI scraperbot scourge which is making many popular websites annoying to use these days and spawning a kind of arms race amongst website owners, content delivery networks, and well-funded and morally-ambiguous AI firms.

Tavis goes into great detail about the estimated costs and inconvenience of this approach — and why it might likely inconvenience scraper bots which use many different IP addresses more than normal traffic which typically does not — as well as how the proof-of-work methodology is implemented using a solution written in the C programming language.

Without going into the safety versus security debate (a focus of the discussion on Lobsters), I thought I would show how to implement this using Factor.

How does Anubis work?

The Anubis challenge is a message, for example 5d737f0600ff2dd, which we use as a prefix while trying up to 262144 different nonce suffixes to find a SHA-256 hash that starts with 16 bits of zero.

The Anubis response in this case is 47224, which has a SHA-256 hash starting with 0000:

$ printf "5d737f0600ff2dd%d" 47224 | sha256sum
000043f7c4392a781a04419a7cb503089ebcf3164e2b1d4258b3e6c15b8b07f1  -

Solving in Factor

Factor includes support for SHA-2 checksums that we can use to solve the example puzzle:

USING: checksums checksums.sha kernel math math.parser sequences ;

: find-anubis ( message -- nonce anubis )
    18 2^ <iota> [
        >dec append sha-256 checksum-bytes
        [ B{ 0 0 } head? ] keep f ?
    ] with map-find swap ;

And a test showing that it works:

{
    47224
    "000043f7c4392a781a04419a7cb503089ebcf3164e2b1d4258b3e6c15b8b07f1"
} [ "5d737f0600ff2dd" find-anubis bytes>hex-string ] unit-test

But, I noticed that it’s not that fast:

IN: scratchpad [ "5d737f0600ff2dd" find-anubis ] time
Running time: 0.216027939 seconds

Solving in Factor using C

Tanis used some functions from the OpenSSL library to compute the checksum.

We can take the same approach using the C library interface. It would be great to be able to parse header files and make this a little simpler, but for now we can define these C functions that we would like to call:

LIBRARY: libcrypto

STRUCT: SHA256_CTX
    { h uint[8] }
    { Nl uint }
    { Nh uint }
    { data uint[16] }
    { num uint }
    { md_len uint } ;

FUNCTION: int SHA256_Init ( SHA256_CTX* c )
FUNCTION: int SHA256_Update ( SHA256_CTX* c, void* data, size_t len )
FUNCTION: int SHA256_Final ( uchar* md, SHA256_CTX* c )

And then build the same C program in Factor:

:: find-anubis ( message -- nonce anubis )
    SHA256_CTX new :> base
    base SHA256_Init 1 assert=
    base message binary encode dup length SHA256_Update 1 assert=

    32 <byte-array> :> hash
    18 2^ <iota> [
        base clone :> ctx
        ctx swap >dec binary encode dup length SHA256_Update 1 assert=
        hash ctx SHA256_Final 1 assert=
        hash B{ 0 0 } head?
    ] find nip hash ;

Is it fast?

IN: scratchpad [ "5d737f0600ff2dd" find-anubis ] time
Running time: 0.009508132 seconds

Sure is!

How does that compare to the original C program?

$ gcc -Ofast -march=native anubis-miner.c -lcrypto -o anubis-miner
$ time ./anubis-miner 5d737f0600ff2dd
47224

real    0m0.005s
user    0m0.003s
sys     0m0.002s

Pretty favorably!

Solving in Factor using C approach

Part of the reason the C approach is fast, is that it hashes the message and then only has to hash the additional bytes of the nonce and check the result passes. We can try this, by cloning our sha-256-state and checking on each iteration whether it passes the test:

USING: checksums checksums.sha kernel math math.parser sequences ;

: find-anubis ( message -- nonce anubis )
    sha-256 initialize-checksum-state swap add-checksum-bytes
    18 2^ <iota> [
        [ clone ] dip >dec add-checksum-bytes
        get-checksum [ B{ 0 0 } head? ] keep f ?
    ] with map-find swap ;

Is that faster?

IN: scratchpad [ "5d737f0600ff2dd" find-anubis ] time
Running time: 0.183254613 seconds

A little bit. But what’s the problem?

IN: scratchpad [ "5d737f0600ff2dd" find-anubis ] profile

IN: scratchpad top-down profile.
depth   time ms  GC %  JIT %  FFI %   FT %
  13     183.0   0.00   0.00  13.11   0.00 T{ thread f "Initial" ~quotation~ ~quotation~ 19 ~box~ f t f H{...
  14     149.0   0.00   0.00   8.72   0.00   M\ sha-256-state get-checksum
  15      85.0   0.00   0.00   2.35   0.00     M\ sha2-short checksum-block
  16      27.0   0.00   0.00   7.41   0.00       4be>
  17      10.0   0.00   0.00   0.00   0.00         M\ virtual-sequence nth-unsafe
  18       8.0   0.00   0.00   0.00   0.00           M\ slice virtual@
  19       6.0   0.00   0.00   0.00   0.00             +
  17       6.0   0.00   0.00   0.00   0.00         M\ fixnum shift
  17       4.0   0.00   0.00  50.00   0.00         fixnum-shift
  17       4.0   0.00   0.00   0.00   0.00         M\ byte-array nth-unsafe
  17       1.0   0.00   0.00   0.00   0.00         bitor
  17       1.0   0.00   0.00   0.00   0.00         >
  17       1.0   0.00   0.00   0.00   0.00         M\ slice length
  16       7.0   0.00   0.00   0.00   0.00       M\ fixnum integer>fixnum
  16       5.0   0.00   0.00   0.00   0.00       M\ fixnum >fixnum
  16       4.0   0.00   0.00   0.00   0.00       be>
  17       2.0   0.00   0.00   0.00   0.00         M\ slice length
  16       2.0   0.00   0.00   0.00   0.00       (byte-array)
  16       2.0   0.00   0.00   0.00   0.00       M\ slice length
  16       1.0   0.00   0.00   0.00   0.00       c-ptr?
  16       1.0   0.00   0.00   0.00   0.00       M\ fixnum integer>fixnum-strict
  15      35.0   0.00   0.00  25.71   0.00     pad-last-block
  16      26.0   0.00   0.00  26.92   0.00       %
  17       7.0   0.00   0.00  85.71   0.00         set-alien-unsigned-1
  17       7.0   0.00   0.00   0.00   0.00         M\ growable nth-unsafe
  18       1.0   0.00   0.00   0.00   0.00           M\ byte-vector underlying>>
  17       4.0   0.00   0.00   0.00   0.00         M\ growable set-nth-unsafe
  18       1.0   0.00   0.00   0.00   0.00           M\ byte-vector underlying>>
  17       2.0   0.00   0.00  50.00   0.00         resize-byte-array
  17       1.0   0.00   0.00   0.00   0.00         M\ growable lengthen
  18       1.0   0.00   0.00   0.00   0.00           M\ byte-vector length>>
  17       1.0   0.00   0.00   0.00   0.00         M\ growable length
  17       1.0   0.00   0.00   0.00   0.00         M\ byte-array set-nth-unsafe
  17       1.0   0.00   0.00   0.00   0.00         integer?
  16       5.0   0.00   0.00   0.00   0.00       >slow-be
  17       2.0   0.00   0.00   0.00   0.00         M\ fixnum shift
  17       1.0   0.00   0.00   0.00   0.00         fixnum-shift
  17       1.0   0.00   0.00   0.00   0.00         M\ fixnum integer>fixnum
  16       1.0   0.00   0.00 100.00   0.00       <byte-array>
  16       1.0   0.00   0.00 100.00   0.00       set-alien-unsigned-1
  16       1.0   0.00   0.00   0.00   0.00       M\ growable set-nth
  17       1.0   0.00   0.00   0.00   0.00         M\ byte-vector underlying>>
  16       1.0   0.00   0.00   0.00   0.00       ,
  17       1.0   0.00   0.00   0.00   0.00         assoc-stack
  18       1.0   0.00   0.00   0.00   0.00           M\ hashtable at*
  15      16.0   0.00   0.00   6.25   0.00     >slow-be
  16       8.0   0.00   0.00   0.00   0.00       M\ fixnum shift
  16       2.0   0.00   0.00  50.00   0.00       (byte-array)
  16       1.0   0.00   0.00   0.00   0.00       fixnum-shift
  16       1.0   0.00   0.00   0.00   0.00       <
  16       1.0   0.00   0.00   0.00   0.00       M\ fixnum integer>fixnum
  15       3.0   0.00   0.00   0.00   0.00     M\ chunking nth-unsafe
  16       2.0   0.00   0.00   0.00   0.00       M\ groups group@
  17       1.0   0.00   0.00   0.00   0.00         M\ fixnum min
  17       1.0   0.00   0.00   0.00   0.00         *
  15       2.0   0.00   0.00   0.00   0.00     >be
  15       1.0   0.00   0.00   0.00   0.00     M\ sha2-state H>>
  15       1.0   0.00   0.00 100.00   0.00     fixnum/i
  15       1.0   0.00   0.00   0.00   0.00     M\ sha2-state clone
  16       1.0   0.00   0.00   0.00   0.00       M\ sha2-state H<<
  15       1.0   0.00   0.00   0.00   0.00     M\ uint-array nth-unsafe
  14      23.0   0.00   0.00  30.43   0.00   M\ block-checksum-state add-checksum-bytes
  15      18.0   0.00   0.00  27.78   0.00     >byte-vector
  16       7.0   0.00   0.00   0.00   0.00       M\ virtual-sequence nth-unsafe
  17       1.0   0.00   0.00   0.00   0.00         M\ slice virtual@
  18       1.0   0.00   0.00   0.00   0.00           +
  16       5.0   0.00   0.00  80.00   0.00       set-alien-unsigned-1
  16       2.0   0.00   0.00   0.00   0.00       M\ byte-array nth-unsafe
  16       2.0   0.00   0.00   0.00   0.00       M\ growable nth-unsafe
  17       2.0   0.00   0.00   0.00   0.00         M\ byte-vector underlying>>
  16       1.0   0.00   0.00 100.00   0.00       (byte-array)
  16       1.0   0.00   0.00   0.00   0.00       M\ slice length
  15       2.0   0.00   0.00 100.00   0.00     fixnum/i
  15       1.0   0.00   0.00   0.00   0.00     M\ string nth-unsafe
  15       1.0   0.00   0.00   0.00   0.00     >
  15       1.0   0.00   0.00   0.00   0.00     number=
  14       4.0   0.00   0.00   0.00   0.00   head?
  15       1.0   0.00   0.00   0.00   0.00     M\ byte-array length
  15       1.0   0.00   0.00   0.00   0.00     >
  15       1.0   0.00   0.00   0.00   0.00     M\ byte-array nth-unsafe
  15       1.0   0.00   0.00   0.00   0.00     integer?
  14       3.0   0.00   0.00  66.67   0.00   M\ fixnum positive>dec
  15       2.0   0.00   0.00 100.00   0.00     <string>
  14       2.0   0.00   0.00 100.00   0.00   M\ sha2-state clone
  15       1.0   0.00   0.00 100.00   0.00     M\ checksum-state clone
  16       1.0   0.00   0.00 100.00   0.00       (clone)
  15       1.0   0.00   0.00 100.00   0.00     M\ uint-array clone
  16       1.0   0.00   0.00 100.00   0.00       (clone)
  14       1.0   0.00   0.00   0.00   0.00   M\ integer >base
  14       1.0   0.00   0.00   0.00   0.00   reverse!

Visualizing the profile using the flamegraph vocabulary allows us to dig a little bit further:

Looks like a lot of generic dispatch, inefficient byte swapping, memory allocations, and type conversions. Probably this could be made much faster by looking into how we handle block checksums.

PRs welcome!