Re: Factor

Factor: the language, the theory, and the practice.


Thursday, December 5, 2013


Zachary Voase published the humanhash project on GitHub, which makes “human-readable representations of digests”. Below is a compatible implementation in Factor.

To show how it works, we use the example from the humanhash README:

IN: scratchpad CONSTANT: digest "7528880a986c40e78c38115e640da2a1"

IN: scratchpad digest humanhash .
"three-georgia-xray-jig"

IN: scratchpad digest 6 humanhash-words .
"high-mango-white-oregon-purple-charlie"

IN: scratchpad human-uuid4 2array .
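
The definitions below rely on words from several vocabularies. An import along these lines should cover them (the exact vocabulary list is my assumption, not part of the original):

USING: combinators grouping kernel math math.parser
sequences uuid ;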


We need a list of 256 words, one to represent each possible byte:

CONSTANT: default-wordlist {
    "ack" "alabama" "alanine" "alaska" "alpha" "angel" "apart"
    "april" "arizona" "arkansas" "artist" "asparagus" "aspen"
    "august" "autumn" "avocado" "bacon" "bakerloo" "batman" "beer"
    "berlin" "beryllium" "black" "blossom" "blue" "bluebird" "bravo"
    "bulldog" "burger" "butter" "california" "carbon" "cardinal"
    "carolina" "carpet" "cat" "ceiling" "charlie" "chicken" "coffee"
    "cola" "cold" "colorado" "comet" "connecticut" "crazy" "cup"
    "dakota" "december" "delaware" "delta" "diet" "don" "double"
    "early" "earth" "east" "echo" "edward" "eight" "eighteen"
    "eleven" "emma" "enemy" "equal" "failed" "fanta" "fifteen"
    "fillet" "finch" "fish" "five" "fix" "floor" "florida"
    "football" "four" "fourteen" "foxtrot" "freddie" "friend"
    "fruit" "gee" "georgia" "glucose" "golf" "green" "grey" "hamper"
    "happy" "harry" "hawaii" "helium" "high" "hot" "hotel"
    "hydrogen" "idaho" "illinois" "india" "indigo" "ink" "iowa"
    "island" "item" "jersey" "jig" "johnny" "juliet" "july"
    "jupiter" "kansas" "kentucky" "kilo" "king" "kitten" "lactose"
    "lake" "lamp" "lemon" "leopard" "lima" "lion" "lithium" "london"
    "louisiana" "low" "magazine" "magnesium" "maine" "mango" "march"
    "mars" "maryland" "massachusetts" "may" "mexico" "michigan"
    "mike" "minnesota" "mirror" "mississippi" "missouri" "mobile"
    "mockingbird" "monkey" "montana" "moon" "mountain" "muppet"
    "music" "nebraska" "neptune" "network" "nevada" "nine"
    "nineteen" "nitrogen" "north" "november" "nuts" "october" "ohio"
    "oklahoma" "one" "orange" "oranges" "oregon" "oscar" "oven"
    "oxygen" "papa" "paris" "pasta" "pennsylvania" "pip" "pizza"
    "pluto" "potato" "princess" "purple" "quebec" "queen" "quiet"
    "red" "river" "robert" "robin" "romeo" "rugby" "sad" "salami"
    "saturn" "september" "seven" "seventeen" "shade" "sierra"
    "single" "sink" "six" "sixteen" "skylark" "snake" "social"
    "sodium" "solar" "south" "spaghetti" "speaker" "spring"
    "stairway" "steak" "stream" "summer" "sweet" "table" "tango"
    "ten" "tennessee" "tennis" "texas" "thirteen" "three" "timing"
    "triple" "twelve" "twenty" "two" "uncle" "undress" "uniform"
    "uranus" "utah" "vegan" "venus" "vermont" "victor" "video"
    "violet" "virginia" "washington" "west" "whiskey" "white"
    "william" "winner" "winter" "wisconsin" "wolfram" "wyoming"
    "xray" "yankee" "yellow" "zebra" "zulu"
}

One of the inputs is the number of words to produce. We raise an error if the input has fewer bytes than the number of words requested:

ERROR: too-few-bytes seq #words ;

: check-bytes ( seq #words -- seq #words )
    2dup [ length ] [ < ] bi* [ too-few-bytes ] when ; inline
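
For example, trying to compress two bytes into three words triggers the error:

IN: scratchpad B{ 1 2 } 3 check-bytes ! throws too-few-bytes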

Input is grouped into subsequences, one per requested output word. The approach is a little odd: it first splits the bytes into equal-sized groups, then concatenates any leftover bytes onto the last group:

: group-words ( seq #words -- groups )
    [ dupd [ length ] [ /i ] bi* group ]
    [ 1 - cut concat suffix ] bi ; inline
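
A quick example makes the grouping behavior clearer: seven elements into three groups gives two groups of two, with the three leftover elements merged into the last group:

IN: scratchpad { 1 2 3 4 5 6 7 } 3 group-words .
{ { 1 2 } { 3 4 } { 5 6 7 } }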

Our input bytes are compressed, first by grouping them into words, then by XORing the bytes in each word:

: compress-bytes ( seq #words -- newseq )
    check-bytes group-words [ 0 [ bitxor ] reduce ] map ;
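
As a sanity check, the first four bytes of the example digest (0x75 0x28 0x88 0x0a, i.e., 117 40 136 10) XOR down to 223, which is the index of "three" in the word list:

IN: scratchpad B{ 117 40 136 10 } 1 compress-bytes .
{ 223 }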

Our input will be either a byte-array or a hex string in which each pair of characters represents one byte:

: byte-string ( hexdigest -- seq )
    dup byte-array? [ 2 <groups> [ hex> ] map ] unless ;
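
For instance, the hex string "7528" becomes the two bytes 117 and 40, while a byte-array passes through untouched:

IN: scratchpad "7528" byte-string .
{ 117 40 }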

Making a humanhash is simply a matter of converting the input to bytes, compressing those into one byte per output word, looking each byte up in the word list, and joining with the requested separator:

: make-humanhash ( hexdigest #words wordlist sep -- hash )
    { [ byte-string ] [ compress-bytes ] [ nths ] [ join ] } spread ;
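
The word list and separator are pluggable. For example, hashing the README digest with a different separator:

IN: scratchpad digest 4 default-wordlist ", " make-humanhash .
"three, georgia, xray, jig"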

We provide a way to hash into a requested number of words, or four by default:

: humanhash-words ( hexdigest #words -- hash )
    default-wordlist "-" make-humanhash ;

: humanhash ( hexdigest -- hash )
    4 humanhash-words ;

And since the humanhash project includes a way to create humanhash'd UUIDs, we provide one as well:

: human-uuid4 ( -- uuid hash )
    uuid4 dup [ CHAR: - = not ] filter humanhash ;