Reddit "Top"
Monday, January 17, 2011
Reddit has an API that can be used for accessing much of the information available through their website. We can retrieve a JSON list of recent stories posted to any subreddit by going to https://api.reddit.com/r/$NAME. You can experiment with this in the Factor listener. For example, to retrieve the top stories for the programming subreddit:
IN: scratchpad USING: http.client json.reader ;
IN: scratchpad "https://api.reddit.com/r/programming"
http-get nip json> .
Someone once used the API to build a reddit-top program for monitoring top stories from the console. We will query the same API from Factor to produce something similar.
We start by building a (subreddit) helper word that retrieves the JSON response for a particular subreddit, extracts the top stories, and returns an array of hashtables (one for each of the top stories).
: (subreddit) ( name -- seq )
    "https://api.reddit.com/r/%s" sprintf http-get nip
    json> { "data" "children" } [ swap at ] each
    [ "data" swap at ] map ;
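To see what the traversal inside (subreddit) does without touching the network, the same steps can be tried on a small hand-built hashtable. This is only a sketch: the nesting mirrors the "data" and "children" keys used above, while the titles and scores are invented for illustration and are not real API output.

```factor
USING: assocs kernel prettyprint sequences ;

! Hand-built stand-in for the parsed response; titles and scores are invented.
H{
    { "data" H{
        { "children" {
            H{ { "data" H{ { "title" "First story" } { "score" 10 } } } }
            H{ { "data" H{ { "title" "Second story" } { "score" 7 } } } }
        } }
    } }
}
! Walk down through "data" and then "children" ...
{ "data" "children" } [ swap at ] each
! ... unwrap each child's own "data" hashtable ...
[ "data" swap at ] map
! ... and print just the titles.
[ "title" swap at ] map .
```

Pasted into the listener, this should print { "First story" "Second story" }, which is the same shape of result (one hashtable per story) that (subreddit) hands to the rest of the code.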
We can then define a story tuple, with a slot for each attribute returned by the API.
TUPLE: story author clicked created created_utc domain downs
    hidden id is_self levenshtein likes media media_embed name
    num_comments over_18 permalink saved score selftext
    selftext_html subreddit subreddit_id thumbnail title ups url ;
Once we have that, we can use the set-slots word from my previous post on setting attributes to build a subreddit word that retrieves the top stories as objects:
: subreddit ( name -- stories )
    (subreddit) [ story new [ set-slots ] keep ] map ;
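The set-slots word comes from the earlier post and is not reproduced here. As a rough idea of what such a word might do, here is a sketch built on the mirrors vocabulary, which presents a tuple's slots as an assoc keyed by slot name. The names set-slots-sketch and point are hypothetical and exist only for this illustration; this is not the original definition.

```factor
USING: assocs kernel mirrors prettyprint ;

! Hypothetical two-slot tuple, standing in for the story tuple.
TUPLE: point x y ;

! Hypothetical stand-in for set-slots: copy every key/value pair of the
! assoc into the slot of obj that has the same name.
: set-slots-sketch ( assoc obj -- )
    <mirror> [ swapd set-at ] curry assoc-each ;

! Fill a fresh point from a hashtable, then print the resulting tuple.
H{ { "x" 1 } { "y" 2 } } point new [ set-slots-sketch ] keep .
```

One consequence of this approach, assuming the mirror rejects unknown slot names, is that every key in the hashtable needs a matching slot, which would explain why the story tuple declares a slot for each attribute the API returns.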
That's all we need to build the subreddit-top word demonstrated at the beginning:
- Retrieve the top stories for a given subreddit.
- Loop over each story.
- Format and print the relevant attributes.
: subreddit-top ( subreddit -- )
    subreddit [
        1 + "%2d. " printf {
            [ title>> ]
            [ url>> ]
            [ score>> ]
            [ num_comments>> ]
            [
                created_utc>> unix-time>timestamp now swap time-
                duration>hours "%d hours ago" sprintf
            ]
            [ author>> ]
        } cleave
        "%s\n %s\n %d points, %d comments, posted %s by %s\n\n"
        printf
    ] each-index ;
This (and some code for users and comments) is available on my GitHub.