Re: Factor

Factor: the language, the theory, and the practice.

RDAP

Tuesday, March 18, 2025

#networking

ICANN recently posted an update on "Launching RDAP; Sunsetting WHOIS", which got some discussion on Hacker News. Andy Newton, one of the creators of RDAP, has published "A Guide to the Registration Data Access Protocol (RDAP)", which is a pretty useful resource for understanding how it works. More information comes from the main RDAP website, which describes it as:

The Registration Data Access Protocol (RDAP) is the successor to WHOIS. Like WHOIS, RDAP provides access to information about Internet resources (domain names, autonomous system numbers, and IP addresses). Unlike WHOIS, RDAP provides:

  • A machine-readable representation of registration data;
  • Differentiated access;
  • Structured request and response semantics;
  • Internationalisation;
  • Extensibility.

The WHOIS protocol that it replaces is super simple, being described by RFC 3912 in a few paragraphs. And, in fact, you can test it out on the command-line of most computers:

$ echo -e "factorcode.org\r\n" | nc -i 1 whois.cloudflare.com 43
Domain Name: FACTORCODE.ORG
Registry Domain ID: c49c93dee3304f39b081383262d320c6-LROR
Registrar WHOIS Server: whois.cloudflare.com
Registrar URL: https://www.cloudflare.com
Updated Date: 2025-01-15T22:46:54Z
Creation Date: 2005-12-01T04:54:37Z
Registrar Registration Expiration Date: 2025-12-01T04:54:37Z
Registrar: Cloudflare, Inc.
Registrar IANA ID: 1910
Domain Status: clienttransferprohibited https://icann.org/epp#clienttransferprohibited
...

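For comparison, the same query can be sketched in Factor. This is only a minimal illustration, assuming the standard io.sockets words; the names whois-request and whois-query are made up for this example and are not part of the RDAP code below.

```factor
USING: io io.encodings.utf8 io.sockets kernel sequences ;
IN: scratchpad

! Hypothetical helper: an RFC 3912 request is the query plus CRLF.
: whois-request ( domain -- str )
    "\r\n" append ;

! Hypothetical helper: send the request to port 43, then read
! lines until the server closes the connection.
: whois-query ( domain server -- lines )
    43 <inet> utf8 [
        whois-request write flush
        read-lines
    ] with-client ;
```

Calling "factorcode.org" "whois.cloudflare.com" whois-query should return lines like the ones printed by nc above.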
The RDAP protocol must be equally simple, right? Well, not so fast. Instead of a few paragraphs and simple queries over sockets, you get many RFCs describing it.

These, along with things like the RDAP Extension registry, and the protocol reliance on HTTP/HTTPS, JSON, and JSONPath considerably increase the complexity of RDAP implementations.

Below we are going to start an implementation using Factor.

Bootstrapping

The first concept we have to implement is RDAP Bootstrapping, which uses five IANA files to redirect searches to the correct upstream RDAP servers.

Type                        Link
Forward DNS                 https://data.iana.org/rdap/dns.json
IPv4 Addresses              https://data.iana.org/rdap/ipv4.json
IPv6 Addresses              https://data.iana.org/rdap/ipv6.json
Autonomous System Numbers   https://data.iana.org/rdap/asn.json
Object Tags                 https://data.iana.org/rdap/object-tags.json

We can abstract these by making a word to convert a type to a URL:

: bootstrap-url ( type -- url )
    "https://data.iana.org/rdap/" ".json" surround ;
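As a quick check, here is what the word produces for the "dns" type. The definition is repeated so the snippet stands alone.

```factor
USING: kernel prettyprint sequences ;
IN: scratchpad

! Same definition as above: wrap the type in a prefix and suffix.
: bootstrap-url ( type -- url )
    "https://data.iana.org/rdap/" ".json" surround ;

"dns" bootstrap-url .
! => "https://data.iana.org/rdap/dns.json"
```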

We don’t want to retrieve these files all the time, so let’s cache them for 30 days:

INITIALIZED-SYMBOL: bootstrap-cache [ 30 days ]

: bootstrap-get ( type -- data )
    bootstrap-url cache-directory bootstrap-cache get
    download-outdated-into path>json ;

And provide a way to force delete the cached bootstrap files:

CONSTANT: bootstrap-files { "asn" "dns" "ipv4" "ipv6" "object-tags" }

: reset-bootstrap ( -- )
    [ bootstrap-files [ ".json" append ?delete-file ] each ] with-cache-directory ;

Each bootstrap file is described in RFC 9224, but basically we want to extract and manipulate the "services" block, modifying the keys of the assoc for convenient searching:

: parse-services ( data quot: ( key -- key' ) -- services )
    [ "services" of ] dip '[ [ _ map ] dip ] assoc-map ; inline

: search-services ( services quot: ( key -- ? ) -- urls )
    '[ drop _ any? ] assoc-find drop nip ; inline
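To see what search-services operates on, here is a small sketch against a hand-written services list in the RFC 9224 shape, where each entry pairs a list of keys with a list of base URLs. The sample-services constant and its example.test URLs are made up for illustration; search-services is repeated from above so the snippet stands alone.

```factor
USING: assocs fry kernel prettyprint sequences ;
IN: scratchpad

: search-services ( services quot: ( key -- ? ) -- urls )
    '[ drop _ any? ] assoc-find drop nip ; inline

! Made-up data: each entry is { keys urls }, so the array
! can be treated as an assoc from key lists to URL lists.
CONSTANT: sample-services {
    { { "com" "net" } { "https://rdap.example.test/a/" } }
    { { "org" } { "https://rdap.example.test/b/" } }
}

! Return the URLs of the first entry with a matching key.
sample-services [ "org" = ] search-services .
! => { "https://rdap.example.test/b/" }
```

When no entry matches, assoc-find leaves f values on the stack, so the word returns f rather than an empty list.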

We can then provide bootstrap data structures that are used for searching. For example, we look for a suffix of the domain name that has an entry in the dns bootstrap list, which handles both second-level domains (SLD) and top-level domains (TLD):

: dns-bootstrap ( -- services )
    "dns" bootstrap-get "services" of ;

: split-domain ( domain -- domains )
    "." split dup length <iota> [ tail "." join ] with map ;

: domain-endpoints ( domain -- urls )
    split-domain dns-bootstrap [ swap member? ] with search-services ;
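To make the candidate list concrete, this is what split-domain returns for a three-label name, from the full name down to the TLD. The definition is repeated so the snippet stands alone.

```factor
USING: kernel prettyprint sequences splitting ;
IN: scratchpad

: split-domain ( domain -- domains )
    "." split dup length <iota> [ tail "." join ] with map ;

! Drop 0, 1, 2, ... leading labels and re-join the rest.
"www.factorcode.org" split-domain .
! => { "www.factorcode.org" "factorcode.org" "org" }
```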

You can see that different domain names are directed to different RDAP endpoints:

IN: scratchpad "factorcode.org" domain-endpoints .
{ "https://rdap.publicinterestregistry.org/rdap/" }

IN: scratchpad "google.com" domain-endpoints .
{ "https://rdap.verisign.com/com/v1/" }

Or, find the correct endpoint for a given IPv4 address from the ipv4 bootstrap list:

: ipv4-bootstrap ( -- services )
    "ipv4" bootstrap-get [ >ipv4-network ] parse-services ;

: ipv4-endpoints ( ipv4 -- urls )
    ipv4-aton ipv4-bootstrap [ ipv4-contains? ] with search-services ;
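The containment test behind this lookup can be sketched with plain integer arithmetic. The word cidr-contains? below is a hypothetical helper written for this example, not the ipv4-contains? used above, and it assumes ipv4-aton comes from the ip-parser vocabulary: an address belongs to a network when both agree on their leading prefix bits.

```factor
USING: fry ip-parser kernel math math.parser prettyprint
splitting ;
IN: scratchpad

! Hypothetical helper: does the address fall inside the CIDR block?
: cidr-contains? ( ipv4 cidr -- ? )
    "/" split1 string>number 32 swap -   ! number of host bits
    [ [ ipv4-aton ] bi@ ] dip            ! both addresses as integers
    neg '[ _ shift ] bi@ = ;             ! shift right, compare network bits

"1.1.1.1" "1.1.1.0/24" cidr-contains? .
! => t
"1.1.2.1" "1.1.1.0/24" cidr-contains? .
! => f
```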

Lookup

The RDAP data is typically served as JSON by HTTP or HTTPS web servers, but it uses a custom media type, application/rdap+json. We can write a simple word to make the request and convert the response:

: accept-rdap ( request -- request )
    "application/rdap+json" "Accept" set-header ;

: rdap-get ( url -- response rdap )
    <get-request> accept-rdap http-request
    dup string? [ utf8 decode ] unless json> ;

And now we can build a word to look up a domain:

: lookup-domain ( domain -- results )
    [ domain-endpoints random ]
    [ "domain/%s" sprintf derive-url rdap-get nip ] bi ;

Or to look up an IPv4 address:

: lookup-ipv4 ( ipv4 -- results )
    [ ipv4-endpoints random ]
    [ "ip/%s" sprintf derive-url rdap-get nip ] bi ;

And we can try it out, getting an extensive response:

IN: scratchpad "factorcode.org" lookup-domain

--- Data stack:
LH{ { "rdapConformance" ~array~ } { "notices" ~array~ } { ...

Output

It would be nice to print the output in a more human-readable format. For now, we will just print these as a nested tree of keys and values:

GENERIC: print-rdap-nested ( padding key value -- )

M: linked-assoc print-rdap-nested
    [ over write write ":" print "  " append ] dip
    [ swapd print-rdap-nested ] with assoc-each ;

M: array print-rdap-nested
    [ print-rdap-nested ] 2with each ;

M: object print-rdap-nested
    present [ 2drop ] [ [ ": " [ write ] tri@ ] dip print ] if-empty ;

: print-rdap ( results -- )
    [ "" -rot print-rdap-nested ] assoc-each ;

This could probably be improved a fair bit – for example, the keys could be made more readable, and it doesn’t handle vCard entries very well.

Try it out!

You can try this out by looking up a domain name:

IN: scratchpad "factorcode.org" lookup-domain print-rdap
rdapConformance: rdap_level_0
rdapConformance: icann_rdap_response_profile_0
rdapConformance: icann_rdap_technical_implementation_guide_0
ldhName: factorcode.org
unicodeName: factorcode.org
nameservers:
  ldhName: carl.ns.cloudflare.com
  unicodeName: carl.ns.cloudflare.com
  objectClassName: nameserver
  handle: c34bedeccd8e4514b917e9e82a052077-LROR
  status: associated
nameservers:
  ldhName: kay.ns.cloudflare.com
  unicodeName: kay.ns.cloudflare.com
  objectClassName: nameserver
  handle: 7fc12bf413944de088f27f837349a8da-LROR
  status: associated
...

Or, by looking up an IP address:

IN: scratchpad "1.1.1.1" lookup-ipv4 print-rdap
rdapConformance: history_version_0
rdapConformance: nro_rdap_profile_0
rdapConformance: cidr0
rdapConformance: rdap_level_0
events:
  eventAction: registration
  eventDate: 2011-08-10T23:12:35Z
events:
  eventAction: last changed
  eventDate: 2023-04-26T22:57:58Z
name: APNIC-LABS
status: active
type: ASSIGNED PORTABLE
endAddress: 1.1.1.255
ipVersion: v4
startAddress: 1.1.1.0
objectClassName: ip network
handle: 1.1.1.0 - 1.1.1.255
...

Pretty cool!

This was recently committed to the development version of Factor.