Re: Factor

Factor: the language, the theory, and the practice.


Tuesday, March 7, 2023

#pdf #text

Vasudev Ram has a blog covering a range of programming topics including Python, Linux, SQL, and PDFs. On the topic of PDF generation, they have a post about making an ASCII Table to PDF with xtopdf.

"Recently, I had the need for an ASCII table lookup, which I searched for and found, thanks to the folks here:

"That gave me the idea of writing a simple program to generate an ASCII table in PDF. Here is the code for a part of that table - the first 32 (0 to 31) ASCII characters, which are the control characters:"

It might not be widely known, but Factor has built-in support for writing to PDF streams using the formatted output protocol. This protocol supports text styles such as font names, bold and italic styles, and foreground and background colors.

We start by defining the symbols and descriptions of the first 32 ASCII characters. These are all non-printable control characters, which is why we use this array of strings to render them in a table.

CONSTANT: ASCII {
    "NUL Null char"
    "SOH Start of Heading"
    "STX Start of Text"
    "ETX End of Text"
    "EOT End of Transmission"
    "ENQ Enquiry"
    "ACK Acknowledgment"
    "BEL Bell"
    "BS Back Space"
    "HT Horizontal Tab"
    "LF Line Feed"
    "VT Vertical Tab"
    "FF Form Feed"
    "CR Carriage Return"
    "SO Shift Out / X-On"
    "SI Shift In / X-Off"
    "DLE Data Line Escape"
    "DC1 Device Control 1 (oft. XON)"
    "DC2 Device Control 2"
    "DC3 Device Control 3 (oft. XOFF)"
    "DC4 Device Control 4"
    "NAK Negative Acknowledgement"
    "SYN Synchronous Idle"
    "ETB End of Transmit Block"
    "CAN Cancel"
    "EM End of Medium"
    "SUB Substitute"
    "ESC Escape"
    "FS File Separator"
    "GS Group Separator"
    "RS Record Separator"
    "US Unit Separator"
}

The core printing logic is a header, followed by rows for each character, formatted into a table of decimal, octal, hexadecimal, and binary values along with their symbol and description from the array above:

: ascii. ( -- )
    "ASCII Control Characters - 0 to 31" print nl
    ASCII [
        swap [
            {
                [ >dec ]
                [ >oct 3 CHAR: 0 pad-head ]
                [ >hex 2 CHAR: 0 pad-head ]
                [ >bin 8 CHAR: 0 pad-head ]
            } cleave
        ] dip " " split1 6 narray
    ] map-index {
        "DEC" "OCT" "HEX" "BIN" "Symbol" "Description"
    } prefix format-table unclip
    H{ { font-style bold } } format nl
    [ print ] each ;
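
To make the row construction concrete, here is a small listener sketch for a single entry, index 10 (line feed). It only uses words that already appear in the definition above; the vocabularies named in the USING: line are where I believe those words live, so treat that line as an assumption rather than a checked import list.

```factor
! Assumed vocabularies for the words used below.
USING: kernel math.parser prettyprint sequences splitting ;

! Each numeric column is the index converted to a base,
! then left-padded with zeros to a fixed width.
10 >oct 3 CHAR: 0 pad-head .    ! "012"
10 >hex 2 CHAR: 0 pad-head .    ! "0a"
10 >bin 8 CHAR: 0 pad-head .    ! "00001010"

! split1 splits on the first space only, so multi-word
! descriptions stay in one piece.
"LF Line Feed" " " split1 [ . ] bi@    ! "LF" then "Line Feed"
```

Together with the decimal value, those pieces are the six cells that narray collects into one table row.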

Since the UI listener supports formatted streams, you can run ascii. there and see the styled table directly, with the header row in bold.

Outputting this to a PDF file is now easy. We make sure to set the font to monospace and then run ascii. with our PDF writer, saving the generated PDF output into a file.

: ascii-pdf ( path -- )
    [
        H{ { font-name "monospace" } } [ ascii. ] with-style
    ] with-pdf-writer pdf>string swap utf8 set-file-contents ;

We also support writing to HTML streams in a similar manner, so it would be pretty easy to create an ascii-html word to output an HTML file with the same printing logic above but instead using our HTML writer.
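
A minimal sketch of what such an ascii-html word might look like, mirroring ascii-pdf. It relies on the ascii. word defined earlier in this post, and it assumes that with-html-writer (from html.streams) returns XML that xml>string (from xml.writer) can serialize; I have not checked the output against a browser, so read it as an outline rather than a finished word.

```factor
! Assumed vocabularies; ascii. is the word defined earlier in this post.
USING: html.streams io.encodings.utf8 io.files io.styles kernel
xml.writer ;

: ascii-html ( path -- )
    [
        ! Same styled printing logic as the PDF version.
        H{ { font-name "monospace" } } [ ascii. ] with-style
    ] with-html-writer xml>string swap utf8 set-file-contents ;
```

Usage would then be the same as for the PDF word, for example "ascii.html" ascii-html, where the file name is just a placeholder.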