sum
Friday, March 25, 2011
Today’s programming challenge is to implement the “old Unix Sys V R4” sum command:
“The original sum calculated a checksum as the sum of the bytes in the file, modulo 2^16−1, as well as the number of 512-byte blocks the file occupied on disk. Called with no arguments, sum read standard input and wrote the checksum and file blocks to standard output; called with one or more filename arguments, sum read each file and wrote for each a line containing the checksum, file blocks, and filename.”
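The quoted definition can be written as a pair of formulas. This is a sketch in notation introduced here for illustration: b_1, ..., b_n stand for the n bytes of the file, and the block count is taken as the rounded-up file size, which is how the code in this post computes it.

```latex
\[
\text{checksum} = \Bigl(\sum_{i=1}^{n} b_i\Bigr) \bmod \bigl(2^{16}-1\bigr),
\qquad
\text{blocks} = \Bigl\lceil \frac{n}{512} \Bigr\rceil
\]
```

For example, a file of 257 bytes, each equal to 255, has byte sum 257 × 255 = 65535 = 2^16 − 1, so its checksum is 0, and it occupies ⌈257/512⌉ = 1 block. The modulus 2^16 − 1 is the constant 65535 that appears in the code.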
First, some imports:
USING: command-line formatting io io.encodings.binary io.files
kernel math math.functions namespaces sequences ;
Short Version
A quick file-based version might look like this:
: sum-file. ( path -- )
    [
        binary file-contents
        [ sum 65535 mod ] [ length 512 / ceiling ] bi
    ] [ "%d %d %s\n" printf ] bi ;
You can try it out:
IN: scratchpad "/usr/share/dict/words" sum-file.
19278 4858 /usr/share/dict/words
This version has three main drawbacks: it loads the entire file into memory (which might be a problem for big files), it does not print an error if the file is not found, and it does not support standard input.
Full Version
A more complete version might begin by implementing a function that reads from a stream, computing the checksum and the number of 512-byte blocks:
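! Accumulate the byte sum and the byte count one chunk at a time,
! then reduce them to the checksum and the number of 512-byte blocks.
: sum-stream ( -- checksum blocks )
    0 0 [ 65536 read-partial dup ] [
        [ sum nip + ] [ length + nip ] 3bi
    ] while drop [ 65535 mod ] [ 512 / ceiling ] bi* ;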
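Because sum-stream reads from whatever the current input stream is, it can be tried on in-memory bytes without touching the file system. The following is a minimal sketch, assuming the with-byte-reader word from the io.streams.byte-array vocabulary; the sum-bytes helper is hypothetical and is not part of the original code.

```factor
USING: io io.encodings.binary io.streams.byte-array kernel math
math.functions sequences ;

! sum-stream as defined in this post, repeated so the snippet stands alone.
: sum-stream ( -- checksum blocks )
    0 0 [ 65536 read-partial dup ] [
        [ sum nip + ] [ length + nip ] 3bi
    ] while drop [ 65535 mod ] [ 512 / ceiling ] bi* ;

! Hypothetical helper (an assumption for illustration): run sum-stream
! with a byte array as the input stream instead of a file or stdin.
: sum-bytes ( bytes -- checksum blocks )
    binary [ sum-stream ] with-byte-reader ;
```

With these definitions, B{ 1 2 3 } sum-bytes should leave 6 and 1 on the stack: the three bytes add up to 6, and three bytes fit in a single 512-byte block.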
The output should look like CHECKSUM BLOCKS FILENAME:
: sum-stream. ( path -- )
    [ sum-stream ] dip "%d %d %s\n" printf ;
We can generate output for a particular file (printing FILENAME: not found if the file does not exist):
: sum-file. ( path -- )
    dup exists? [
        dup binary [ sum-stream. ] with-file-reader
    ] [ "%s: not found\n" printf ] if ;
And, to prepare a version of sum that we can deploy as a binary and run from the command line, we build a simple MAIN: word:
: run-sum ( -- )
    command-line get [ "" sum-stream. ] [
        [ sum-file. ] each
    ] if-empty ;
MAIN: run-sum
The code for this is on my GitHub.