Recursively Listing Files
Tuesday, December 29, 2009
Update: It was pointed out to me that the recursive-directory-files word in io.directories.search solves this problem. Good to know, and the below article can be thought of as a learning exercise! :)
Factor has several vocabularies for interacting with the filesystem, and quite a lot of work has been done on making these useful. Some interesting blog posts from early 2009 include:
One feature that I haven’t found yet is a word that simply (and recursively) lists the files within a directory. This is often useful when you intend to process some (or all) of those files in some way.
This feature exists in other languages’ standard libraries under different names. In Python, it is os.walk(). In Perl, it is File::Find. Even where it doesn’t exist, it can usually be built relatively easily.
I’m going to show you how to create a word to do that with Factor.
Many words within the standard library are implemented as a public word that defers to a private word for part of the functionality. In this case, I want to separate the “setup” functionality of the word from the “work” functionality.
Let’s define a word list-files that will recursively list all the files within a directory. First, we need to create a growable sequence (in this case a vector) to hold the result, and then we want to call a word that will be used to do the actual work.
: list-files ( path -- seq )
    [ V{ } clone ] dip (list-files) ;
Each path that we process is handled differently depending on the type of file it points to:
- If a symbolic link, we want to read the link and recurse into it.
- If a directory, we want to recurse into each of the files within it.
- If a regular file, we want to add it to the list of files.
Using some of the words in io.directories, io.files, io.files.info, io.files.links, and io.files.types, we can build such a function:
: (list-files) ( seq path -- seq )
    normalize-path dup link-info type>> {
        { +symbolic-link+ [ read-link (list-files) ] }
        { +directory+ [
            [ directory-files ] keep
            ! Join each entry name onto its parent directory before recursing.
            '[ _ prepend-path (list-files) ] each ] }
        { +regular-file+ [ over push ] }
        [ "unsupported" throw ]
    } case ;
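To load these two definitions from a source file rather than typing them into the listener, they need a USING: line, and the helper has to exist before the word that calls it. A minimal sketch of such a preamble follows. The vocabulary name list-files is my own placeholder, and the vocabularies that export each word have moved between Factor releases, so treat the USING: list as an assumption to check against your installation.

```factor
! Hypothetical vocabulary name; pick whatever fits your project.
USING: accessors combinators fry io.directories io.files.info
io.files.links io.files.types io.pathnames kernel sequences ;
IN: list-files

! Forward-declare the helper so that list-files can be defined
! first, in the same order the article presents the two words.
DEFER: (list-files)
```

The alternative to DEFER: is simply to place the definition of (list-files) above list-files in the file.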
We can set up a directory structure like so:
$ tree -f /tmp/foo
/tmp/foo
|-- /tmp/foo/bar
`-- /tmp/foo/baz
    `-- /tmp/foo/baz/foo
1 directory, 2 files
And then use our new function from Factor:
IN: scratchpad "/tmp/foo" list-files .
V{ "/tmp/foo/bar" "/tmp/foo/baz/foo" }
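Since the result is an ordinary vector, the usual sequence words apply to it. As a small sketch of the "process some of the files" use case, the snippet below keeps only the paths with a given suffix and prints them. The ".txt" suffix is an arbitrary choice for illustration, and list-files is assumed to be loaded already.

```factor
USING: io sequences ;

! Keep only the paths ending in ".txt" (an arbitrary example suffix).
"/tmp/foo" list-files [ ".txt" tail? ] filter

! Print each remaining path on its own line.
[ print ] each
```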
This could be improved further by handling file permissions issues, infinite recursion, and lazily generating the list of files (for better performance with large directory trees).
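As one possible starting point for the permissions issue, here is a sketch of a wrapper around directory-files that yields an empty sequence when a directory cannot be read. The name safe-directory-files is hypothetical, and the wrapper swallows every error raised while reading the directory, not only permission errors, so a real version would probably want to inspect the error first.

```factor
USING: continuations io.directories kernel ;

! Like directory-files, but outputs an empty sequence if reading
! the directory throws. Hypothetical helper; it catches all errors.
: safe-directory-files ( path -- seq )
    [ directory-files ] [ 2drop { } ] recover ;
```

Using it would mean replacing directory-files with safe-directory-files in the +directory+ branch of (list-files), so that an unreadable directory is skipped instead of aborting the whole walk.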