Working with CGI: Part 1

Wednesday, January 20, 2010

While Factor can be used to develop many different kinds of programs, some uses just aren’t as common as others, for many reasons.

One such case is its use to develop CGI scripts. The “Common Gateway Interface” (CGI) is a set of conventions for web programming that emerged in the early 1990s. The current version, “CGI/1.1”, is documented in RFC 3875, which you can read if you are curious about the details.

The way it works is simple. The web server:

receives an HTTP request
parses the request headers and payload, and then
calls an application with the request details, which then
renders a response that is then sent back to the client

When developing and testing CGI scripts, it is useful to understand the environment that your program will be running within. For this, we can build a simple Factor program that prints the environment variables it is called with as HTML that can be rendered in a web browser.

Our CGI script should be executable, and on a UNIX-like system (such as Mac OS or Linux) it should begin with a shebang line that indicates which program should process the file’s contents. Here, the shebang is used to call the Factor interpreter with our CGI script. Note the space after the shebang characters: most interpreters do not require it, but Factor does:

#! /path/to/factor

The vocabularies that we will be using:

USING: assocs environment kernel io namespaces sequences
sorting ;

The first output from a CGI script is typically the HTTP headers, including the type of content that is being returned, followed by a blank line that separates the headers from the body. Your script could return any content, including images, audio, or video. But in this case, we will just return plain HTML:

"Content-type: text/html\n\n" print

We can then print the HTML header and begin the body:

"""
<html>
<head>
<title>Debug</title>
</head>
<body>
<pre>
""" print

Next, we will get all the environment variables available to our process and print them, sorted alphabetically:

os-envs >alist sort-keys [
    [ "<b>" write first write "</b>" write ]
    [ " = " write second write nl ] bi
] each
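
One caveat with the loop above: a value that contains characters such as < or & would be interpreted as markup by the browser. A minimal sketch of an escaped variant follows, assuming the escape-string word from the xml.entities vocabulary is available in your Factor build (it is not used in the original script):

```factor
USING: assocs environment io kernel sequences sorting
xml.entities ;

! Same loop as before, but each name and value is passed through
! escape-string (assumed to come from xml.entities) so that characters
! like < and & are written as HTML entities instead of raw markup.
os-envs >alist sort-keys [
    [ "<b>" write first escape-string write "</b>" write ]
    [ " = " write second escape-string write nl ] bi
] each
```

For a debugging page that only you will see, this is optional, but it keeps the output readable when a variable happens to contain HTML-like text.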

And then finish the HTML document with closing tags:

"""
</pre>
</body>
</html>
""" print

If you run this program from the shell, it will print your local user environment. When it is run by a web server, however, it prints the CGI script’s environment. The CGI specification defines environment variables that are used to pass the HTTP request details to the CGI program, and web servers such as Apache add a few of their own. Some of the commonly used ones include:

`DOCUMENT_ROOT`	The root directory of your server
`HTTP_COOKIE`	The visitor's cookie, if one is set
`HTTP_HOST`	The hostname the visitor requested (from the Host header)
`HTTP_REFERER`	The URL of the page that called your program
`HTTP_USER_AGENT`	The browser type of the visitor
`HTTPS`	"on" if the program is being called through a secure server
`PATH`	The system path your server is running under
`QUERY_STRING`	The query string (the part of the URL after the ?)
`REMOTE_ADDR`	The IP address of the visitor
`REMOTE_HOST`	The hostname of the visitor (if your server has reverse-name-lookups on; otherwise this is the IP address again)
`REMOTE_PORT`	The port on the visitor's machine that the connection to the web server comes from
`REMOTE_USER`	The visitor's username (for .htaccess-protected pages)
`REQUEST_METHOD`	The HTTP request method (e.g. GET or POST)
`REQUEST_URI`	The interpreted pathname of the requested document or CGI (relative to the document root)
`SCRIPT_FILENAME`	The full pathname of the current CGI
`SCRIPT_NAME`	The interpreted pathname of the current CGI (relative to the document root)
`SERVER_ADMIN`	The email address for your server's webmaster
`SERVER_NAME`	Your server's fully qualified domain name
`SERVER_PORT`	The port number your server is listening on
`SERVER_SOFTWARE`	The server software you're using (e.g. Apache)

This is a useful fact for testing, since you can easily simulate the request that the web server will be sending to your CGI script by configuring the environment in the appropriate way. More to come on that later…
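
As a rough sketch of the idea, the snippet below sets a couple of request variables in the current process and then reads them back the way a CGI script would. The values are hypothetical, chosen only for illustration, and this is one possible approach rather than the method the later parts of this series will necessarily use:

```factor
USING: environment io kernel sequences ;

! Hypothetical request details, set in this process's own environment.
"GET" "REQUEST_METHOD" set-os-env
"name=factor" "QUERY_STRING" set-os-env

! Read them back the way a CGI script would. os-env returns f for an
! unset variable, so "or" supplies a fallback string to print.
{ "REQUEST_METHOD" "QUERY_STRING" "HTTP_COOKIE" } [
    [ write " = " write ] [ os-env "(unset)" or print ] bi
] each
```

Because the variables live in the process environment, code that runs afterwards in the same process sees them just as it would when started by a web server.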