Compressed Images
Monday, May 27, 2024
A recent contribution by @nomennescio adds support for loading compressed images in Factor!
I’m not talking about graphical images, but rather about the binary image that Factor loads at startup. Specifically, the binary image mainly contains the data and code heaps, as well as some special objects that are used to initialize the Factor libraries.
The compressed image support uses the image_header to communicate that a newer, compressed version of the Factor binary image should be loaded instead of an uncompressed one. We currently use the Zstandard compression method, which offers a reasonable balance of speed and compression ratio.
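To give a rough idea of what happens at load time, here is a minimal C++ sketch of the decompression step using the zstd library. It is not the Factor VM’s actual code: it detects a compressed file by the standard Zstandard frame magic number rather than by Factor’s image_header field, and the function names and flow are illustrative assumptions.

// Minimal sketch (not the Factor VM's loader): detect a Zstandard frame by
// its standard magic number and decompress the whole file into memory
// before handing the bytes to the image loader.
#include <zstd.h>

#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Zstandard frames begin with the little-endian magic number 0xFD2FB528.
static bool looks_zstd_compressed(const std::vector<uint8_t>& bytes) {
    if (bytes.size() < 4) return false;
    uint32_t magic;
    std::memcpy(&magic, bytes.data(), sizeof(magic));
    return magic == 0xFD2FB528u;
}

// Decompress the file contents if needed; uncompressed images pass through.
std::vector<uint8_t> load_image_bytes(std::vector<uint8_t> raw) {
    if (!looks_zstd_compressed(raw)) return raw;

    unsigned long long size = ZSTD_getFrameContentSize(raw.data(), raw.size());
    if (size == ZSTD_CONTENTSIZE_ERROR || size == ZSTD_CONTENTSIZE_UNKNOWN)
        throw std::runtime_error("cannot determine decompressed image size");

    std::vector<uint8_t> image(size);
    size_t written = ZSTD_decompress(image.data(), image.size(),
                                     raw.data(), raw.size());
    if (ZSTD_isError(written))
        throw std::runtime_error(ZSTD_getErrorName(written));
    image.resize(written);
    return image;
}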
Compressibility
The released Factor binary image, which contains a reasonable default set of vocabularies, is around 127 megabytes uncompressed and about 20 megabytes compressed:
127M factor.image
20M factor.image.compressed
One of the criticisms that we have received in the past is that a load-all image, which loads the more than 300,000 lines of Factor code in the main Factor repository, can be almost 500 megabytes. When compressed, that gets significantly reduced, down to 66 megabytes!
483M factor.load-all.image
66M factor.load-all.image.compressed
Performance
This is not without some cost: there is a small startup delay when launching the Factor binary from a compressed image. For example, we can compare the uncompressed and compressed results of loading a default image and doing nothing:
$ time ./factor -i=factor.image -e=""
real 0m0.105s
user 0m0.048s
sys 0m0.057s
$ time ./factor -i=factor.image.compressed -e=""
real 0m0.281s
user 0m0.230s
sys 0m0.050s
Or compare the results when using a load-all image:
$ time ./factor -i=factor.load-all.image -e=""
real 0m0.515s
user 0m0.258s
sys 0m0.257s
$ time ./factor -i=factor.load-all.image.compressed -e=""
real 0m1.042s
user 0m0.809s
sys 0m0.233s
That is not quite an apples-to-apples comparison, as the uncompressed version uses mmap and likely does not fully cache or page it all in, while the compressed version must be fully decompressed up front. However, it gives you a sense of where this feature is heading.
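To make that caveat concrete, here is an illustrative C++ sketch (again, not the VM’s actual code) of the lazy path: mmap returns almost immediately and the kernel only reads pages of the image from disk as they are first touched, whereas the compressed path has to read and decompress every byte up front, as in the earlier sketch.

// Illustrative contrast (not the Factor VM's code): an uncompressed image
// can be mmap'd and paged in lazily, while a compressed image must be read
// and decompressed in full before startup can continue.
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstddef>

// Lazy path: mmap returns almost immediately; actual disk I/O happens only
// when pages of the image are first touched, so `time` undercounts the work.
void* map_uncompressed_image(const char* path, size_t* len_out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return nullptr;
    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return nullptr; }
    *len_out = (size_t)st.st_size;
    void* base = mmap(nullptr, *len_out, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    return base == MAP_FAILED ? nullptr : base;
}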
Deploy
If you run "hello-world" deploy, you can create a relatively small deployed binary that prints Hello world when run. This can then be compressed manually to see the difference in size (~25% smaller) with a negligible difference in runtime:
$ du -h hello-world*
1.8M hello-world
1.3M hello-world-compressed
$ time ./hello-world
Hello world
real 0m0.005s
user 0m0.001s
sys 0m0.004s
$ time ./hello-world-compressed
Hello world
real 0m0.005s
user 0m0.001s
sys 0m0.003s
Some additional work needs to be done to add a checkbox in the deploy tools for creating binaries using compression; however, this already represents a big win for anyone who is more concerned about file sizes than startup latency.
Compression is currently supported using the tools.image.compressor vocabulary and decompression using the tools.image.uncompressor vocabulary. This is a new feature and might change as it evolves, but it offers a neat preview of things to come in the next release.
Give it a try!