Anubis
Thursday, August 21, 2025
Tavis Ormany wrote a great blog post about the Anubis project asking a very valid sounding question:
Hey… quick question, why are anime catgirls blocking my access to the Linux kernel?
The answer seems to be that it “weighs the soul of your connection using one or more challenges in order to protect upstream resources from scraper bots”. In particular, this project is an attempt to fight the AI scraperbot scourge which is making many popular websites annoying to use these days and spawning a kind of arms race amongst website owners, content delivery networks, and well-funded and morally-ambiguous AI firms.
Tavis goes into great detail about the estimated costs and inconvenience of this approach — and why it might likely inconvenience scraper bots which use many different IP addresses more than normal traffic which typically does not — as well as how the proof-of-work methodology is implemented using a solution written in the C programming language.
Without going into the safety versus security debate (a focus of the discussion on Lobsters), I thought I would show how to implement this using Factor.
How does Anubis work?
The Anubis challenge is a message, for example 5d737f0600ff2dd
, which we
use as a prefix while trying up to 262144
different nonce suffixes to
find a SHA-256 hash that starts with 16 bits of zero.
The Anubis response in this case is 47224
, which has a SHA-256 hash
starting with 0000
:
$ printf "5d737f0600ff2dd%d" 47224 | sha256sum
000043f7c4392a781a04419a7cb503089ebcf3164e2b1d4258b3e6c15b8b07f1 -
Solving in Factor
Factor includes support for SHA-2 checksums that we can use to solve the example puzzle:
USING: checksums checksums.sha kernel math math.parser sequences ;
: find-anubis ( message -- nonce anubis )
18 2^ <iota> [
>dec append sha-256 checksum-bytes
[ B{ 0 0 } head? ] keep f ?
] with map-find swap ;
And a test showing that it works:
{
47224
"000043f7c4392a781a04419a7cb503089ebcf3164e2b1d4258b3e6c15b8b07f1"
} [ "5d737f0600ff2dd" find-anubis bytes>hex-string ] unit-test
But, I noticed that it’s not that fast:
IN: scratchpad [ "5d737f0600ff2dd" find-anubis ] time
Running time: 0.216027939 seconds
Solving in Factor using C
Tanis used some functions from the OpenSSL library to compute the checksum.
We can take the same approach using the C library interface. It would be great to be able to parse header files and make this a little simpler, but for now we can define these C functions that we would like to call:
LIBRARY: libcrypto
STRUCT: SHA256_CTX
{ h uint[8] }
{ Nl uint }
{ Nh uint }
{ data uint[16] }
{ num uint }
{ md_len uint } ;
FUNCTION: int SHA256_Init ( SHA256_CTX* c )
FUNCTION: int SHA256_Update ( SHA256_CTX* c, void* data, size_t len )
FUNCTION: int SHA256_Final ( uchar* md, SHA256_CTX* c )
And then build the same C program in Factor:
:: find-anubis ( message -- nonce anubis )
SHA256_CTX new :> base
base SHA256_Init 1 assert=
base message binary encode dup length SHA256_Update 1 assert=
32 <byte-array> :> hash
18 2^ <iota> [
base clone :> ctx
ctx swap >dec binary encode dup length SHA256_Update 1 assert=
hash ctx SHA256_Final 1 assert=
hash B{ 0 0 } head?
] find nip hash ;
Is it fast?
IN: scratchpad [ "5d737f0600ff2dd" find-anubis ] time
Running time: 0.009508132 seconds
Sure is!
How does that compare to the original C program?
$ gcc -Ofast -march=native anubis-miner.c -lcrypto -o anubis-miner
$ time ./anubis-miner 5d737f0600ff2dd
47224
real 0m0.005s
user 0m0.003s
sys 0m0.002s
Pretty favorably!
Solving in Factor using C approach
Part of the reason the C approach is fast, is that it hashes the message and
then only has to hash the additional bytes of the nonce and check the result
passes. We can try this, by cloning our sha-256-state
and checking on each
iteration whether it passes the test:
USING: checksums checksums.sha kernel math math.parser sequences ;
: find-anubis ( message -- nonce anubis )
sha-256 initialize-checksum-state swap add-checksum-bytes
18 2^ <iota> [
[ clone ] dip >dec add-checksum-bytes
get-checksum [ B{ 0 0 } head? ] keep f ?
] with map-find swap ;
Is that faster?
IN: scratchpad [ "5d737f0600ff2dd" find-anubis ] time
Running time: 0.183254613 seconds
A little bit. But what’s the problem?
IN: scratchpad [ "5d737f0600ff2dd" find-anubis ] profile
IN: scratchpad top-down profile.
depth time ms GC % JIT % FFI % FT %
13 183.0 0.00 0.00 13.11 0.00 T{ thread f "Initial" ~quotation~ ~quotation~ 19 ~box~ f t f H{...
14 149.0 0.00 0.00 8.72 0.00 M\ sha-256-state get-checksum
15 85.0 0.00 0.00 2.35 0.00 M\ sha2-short checksum-block
16 27.0 0.00 0.00 7.41 0.00 4be>
17 10.0 0.00 0.00 0.00 0.00 M\ virtual-sequence nth-unsafe
18 8.0 0.00 0.00 0.00 0.00 M\ slice virtual@
19 6.0 0.00 0.00 0.00 0.00 +
17 6.0 0.00 0.00 0.00 0.00 M\ fixnum shift
17 4.0 0.00 0.00 50.00 0.00 fixnum-shift
17 4.0 0.00 0.00 0.00 0.00 M\ byte-array nth-unsafe
17 1.0 0.00 0.00 0.00 0.00 bitor
17 1.0 0.00 0.00 0.00 0.00 >
17 1.0 0.00 0.00 0.00 0.00 M\ slice length
16 7.0 0.00 0.00 0.00 0.00 M\ fixnum integer>fixnum
16 5.0 0.00 0.00 0.00 0.00 M\ fixnum >fixnum
16 4.0 0.00 0.00 0.00 0.00 be>
17 2.0 0.00 0.00 0.00 0.00 M\ slice length
16 2.0 0.00 0.00 0.00 0.00 (byte-array)
16 2.0 0.00 0.00 0.00 0.00 M\ slice length
16 1.0 0.00 0.00 0.00 0.00 c-ptr?
16 1.0 0.00 0.00 0.00 0.00 M\ fixnum integer>fixnum-strict
15 35.0 0.00 0.00 25.71 0.00 pad-last-block
16 26.0 0.00 0.00 26.92 0.00 %
17 7.0 0.00 0.00 85.71 0.00 set-alien-unsigned-1
17 7.0 0.00 0.00 0.00 0.00 M\ growable nth-unsafe
18 1.0 0.00 0.00 0.00 0.00 M\ byte-vector underlying>>
17 4.0 0.00 0.00 0.00 0.00 M\ growable set-nth-unsafe
18 1.0 0.00 0.00 0.00 0.00 M\ byte-vector underlying>>
17 2.0 0.00 0.00 50.00 0.00 resize-byte-array
17 1.0 0.00 0.00 0.00 0.00 M\ growable lengthen
18 1.0 0.00 0.00 0.00 0.00 M\ byte-vector length>>
17 1.0 0.00 0.00 0.00 0.00 M\ growable length
17 1.0 0.00 0.00 0.00 0.00 M\ byte-array set-nth-unsafe
17 1.0 0.00 0.00 0.00 0.00 integer?
16 5.0 0.00 0.00 0.00 0.00 >slow-be
17 2.0 0.00 0.00 0.00 0.00 M\ fixnum shift
17 1.0 0.00 0.00 0.00 0.00 fixnum-shift
17 1.0 0.00 0.00 0.00 0.00 M\ fixnum integer>fixnum
16 1.0 0.00 0.00 100.00 0.00 <byte-array>
16 1.0 0.00 0.00 100.00 0.00 set-alien-unsigned-1
16 1.0 0.00 0.00 0.00 0.00 M\ growable set-nth
17 1.0 0.00 0.00 0.00 0.00 M\ byte-vector underlying>>
16 1.0 0.00 0.00 0.00 0.00 ,
17 1.0 0.00 0.00 0.00 0.00 assoc-stack
18 1.0 0.00 0.00 0.00 0.00 M\ hashtable at*
15 16.0 0.00 0.00 6.25 0.00 >slow-be
16 8.0 0.00 0.00 0.00 0.00 M\ fixnum shift
16 2.0 0.00 0.00 50.00 0.00 (byte-array)
16 1.0 0.00 0.00 0.00 0.00 fixnum-shift
16 1.0 0.00 0.00 0.00 0.00 <
16 1.0 0.00 0.00 0.00 0.00 M\ fixnum integer>fixnum
15 3.0 0.00 0.00 0.00 0.00 M\ chunking nth-unsafe
16 2.0 0.00 0.00 0.00 0.00 M\ groups group@
17 1.0 0.00 0.00 0.00 0.00 M\ fixnum min
17 1.0 0.00 0.00 0.00 0.00 *
15 2.0 0.00 0.00 0.00 0.00 >be
15 1.0 0.00 0.00 0.00 0.00 M\ sha2-state H>>
15 1.0 0.00 0.00 100.00 0.00 fixnum/i
15 1.0 0.00 0.00 0.00 0.00 M\ sha2-state clone
16 1.0 0.00 0.00 0.00 0.00 M\ sha2-state H<<
15 1.0 0.00 0.00 0.00 0.00 M\ uint-array nth-unsafe
14 23.0 0.00 0.00 30.43 0.00 M\ block-checksum-state add-checksum-bytes
15 18.0 0.00 0.00 27.78 0.00 >byte-vector
16 7.0 0.00 0.00 0.00 0.00 M\ virtual-sequence nth-unsafe
17 1.0 0.00 0.00 0.00 0.00 M\ slice virtual@
18 1.0 0.00 0.00 0.00 0.00 +
16 5.0 0.00 0.00 80.00 0.00 set-alien-unsigned-1
16 2.0 0.00 0.00 0.00 0.00 M\ byte-array nth-unsafe
16 2.0 0.00 0.00 0.00 0.00 M\ growable nth-unsafe
17 2.0 0.00 0.00 0.00 0.00 M\ byte-vector underlying>>
16 1.0 0.00 0.00 100.00 0.00 (byte-array)
16 1.0 0.00 0.00 0.00 0.00 M\ slice length
15 2.0 0.00 0.00 100.00 0.00 fixnum/i
15 1.0 0.00 0.00 0.00 0.00 M\ string nth-unsafe
15 1.0 0.00 0.00 0.00 0.00 >
15 1.0 0.00 0.00 0.00 0.00 number=
14 4.0 0.00 0.00 0.00 0.00 head?
15 1.0 0.00 0.00 0.00 0.00 M\ byte-array length
15 1.0 0.00 0.00 0.00 0.00 >
15 1.0 0.00 0.00 0.00 0.00 M\ byte-array nth-unsafe
15 1.0 0.00 0.00 0.00 0.00 integer?
14 3.0 0.00 0.00 66.67 0.00 M\ fixnum positive>dec
15 2.0 0.00 0.00 100.00 0.00 <string>
14 2.0 0.00 0.00 100.00 0.00 M\ sha2-state clone
15 1.0 0.00 0.00 100.00 0.00 M\ checksum-state clone
16 1.0 0.00 0.00 100.00 0.00 (clone)
15 1.0 0.00 0.00 100.00 0.00 M\ uint-array clone
16 1.0 0.00 0.00 100.00 0.00 (clone)
14 1.0 0.00 0.00 0.00 0.00 M\ integer >base
14 1.0 0.00 0.00 0.00 0.00 reverse!
Visualizing the profile using the flamegraph vocabulary allows us to dig a little bit further:
Looks like a lot of generic dispatch, inefficient byte swapping, memory allocations, and type conversions. Probably this could be made much faster by looking into how we handle block checksums.
PRs welcome!