Parsing IPv4 Addresses
Thursday, October 18, 2012
I came across an interesting question on StackOverflow asking about
parsing IPv4 addresses.
Typically, IPv4 addresses are specified with four components (e.g.,
something like 192.164.1.55). As the top answer points out, you might
be surprised to see that ping interprets addresses a bit oddly:
C:\>ping 1
Pinging 0.0.0.1 with 32 bytes of data:
C:\>ping 1.2
Pinging 1.0.0.2 with 32 bytes of data:
C:\>ping 1.2.3
Pinging 1.2.0.3 with 32 bytes of data:
C:\>ping 1.2.3.4
Pinging 1.2.3.4 with 32 bytes of data:
C:\>ping 1.2.3.4.5
Ping request could not find host 1.2.3.4.5. Please check the name and try again.
C:\>ping 255
Pinging 0.0.0.255 with 32 bytes of data:
C:\>ping 256
Pinging 0.0.1.0 with 32 bytes of data:
In fact, you can reach google.com using the IP address specified as
dotted decimal (74.125.226.4), flat decimal (1249763844), dotted
octal (0112.0175.0342.0004), flat octal (011237361004), dotted hex
(0x4A.0x7D.0xE2.0x04), flat hex (0x4A7DE204), or even a mix of
these (74.0175.0xe2.4).
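As a quick sanity check that these spellings name the same address, here is a minimal listener sketch. It assumes a Factor version whose string>number accepts the 0x and 0o prefixes; since Factor spells octal with 0o, the C-style 011237361004 is written 0o11237361004 here.

```factor
USING: kernel math math.parser prettyprint sequences ;

! The flat decimal, hex, and octal spellings parse to the same integer.
"1249763844" string>number .       ! 1249763844
"0x4A7DE204" string>number .       ! 1249763844
"0o11237361004" string>number .    ! 1249763844

! Peel off the four octets, most significant first.
1249763844 { -24 -16 -8 0 } [ shift 255 bitand ] with map .
! { 74 125 226 4 }
```

The last line is just the dotted decimal form again: 74.125.226.4.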
Implementation
Of course, my first thought was that Factor should have a parser that
works similarly (especially since I implemented support for the ping
protocol a while ago). We want a parse-ipv4 word taking a string
representing the address and returning an IPv4 address string that has
the typical four components.
First, we need a word to split a string into numeric components and a
word to join them back together:
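! Vocabularies needed by the words in this post.
USING: combinators combinators.short-circuit kernel locals math
math.parser sequences splitting ;

: split-components ( str -- array )
    "." split [ string>number ] map ;

: join-components ( array -- str )
    [ number>string ] map "." join ;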
Then, we can parse the address simply:
: parse-ipv4 ( str -- ip )
    ! Pad short forms with zeros: D -> 0.0.0.D, A.D -> A.0.0.D, A.B.D -> A.B.0.D
    split-components dup length {
        { 1 [ { 0 0 0 } prepend ] }
        { 2 [ first2 [| A D | { A 0 0 D } ] call ] }
        { 3 [ first3 [| A B D | { A B 0 D } ] call ] }
        { 4 [ ] }
    } case join-components ;
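To see the padding rules in action, the following listener session mirrors the ping examples from earlier. It is only a usage sketch and assumes the three words above (split-components, join-components, and parse-ipv4) have already been entered in the same listener.

```factor
USING: prettyprint ;

! Assumes parse-ipv4 and its helpers from this post are already defined.
"1" parse-ipv4 .          ! "0.0.0.1"
"1.2" parse-ipv4 .        ! "1.0.0.2"
"1.2.3" parse-ipv4 .      ! "1.2.0.3"
"1.2.3.4" parse-ipv4 .    ! "1.2.3.4"
```

An input with five components, such as 1.2.3.4.5, matches none of the case branches, so this version signals an error rather than returning a string.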
Extras
If we want to support octal addresses, we can convert an octal number
like 0112 to something Factor can easily parse (0o112) in our
splitting code:
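: cleanup-octal ( str -- str )
    ! Rewrite a C-style octal component such as 0112 as 0o112.
    ! A bare 0 is left alone; rewriting it would yield 0o, which has no digits.
    dup { [ length 1 > ] [ "0" head? ] [ "0x" head? not ] } 1&&
    [ 1 tail "0o" prepend ] when ;

: split-components ( str -- array )
    "." split [ cleanup-octal string>number ] map ;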
And if we want to support the “carry propagation” which allows 256 to
mean 0.0.1.0, we need to “bubble” the array before joining:
: bubble ( array -- array' )
    ! Walk from the last component to the first, carrying overflow past 255.
    reverse 0 swap [ + 256 /mod ] map reverse nip ;

: join-components ( array -- str )
    bubble [ number>string ] map "." join ;
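To make the carry step concrete, here is a small trace that runs the same steps as bubble inline, so it can be pasted into a listener on its own. The input arrays are what parse-ipv4 would hand over for the flat addresses 256 and 1249763844.

```factor
USING: kernel math prettyprint sequences ;

! 256 does not fit in the last octet, so 1 is carried into the third.
{ 0 0 0 256 } reverse 0 swap [ + 256 /mod ] map reverse nip .
! { 0 0 1 0 }

! A flat decimal address spreads across all four octets.
{ 0 0 0 1249763844 } reverse 0 swap [ + 256 /mod ] map reverse nip .
! { 74 125 226 4 }
```

Note that nip discards whatever carry is left after the first component, so in this sketch a value that does not fit in 32 bits wraps around silently instead of being rejected.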
This (along with some error handling) has been committed to the Factor
repository in the ip-parser vocabulary. If it proves useful, it might
be nice to change the io.sockets vocabulary to use this when resolving
IPv4 addresses…