In Part 1, we prepared a large list of RSS feeds and filtered them down to something workable. In Part 2, we processed all of the articles in the feeds and presented posts from the last five days as a static RSS reader. In Part 3, we used a number of JSON APIs from social networking websites to gauge the popularity of articles and highlight the popular posts at the top of the page on our static reader.
In this 4th and final post in the series we will explore different ways to disseminate the results of our filtering.
Step 1: Send via Email
Last time in Part 3 we ended up with a result that was quite passable. The scripts generated an HTML page that promoted popular AI, Data Mining, Machine Learning, etc. articles from the last 7 days, followed by a listing, organized by day, of the other articles that were deemed less popular but may still be of interest. A simple way to disseminate the results of this script is to send an email. The objective here is to receive the equivalent of an AI-themed version of the most excellent Hacker Newsletter.
I have a Google Gmail account and I assume most programmers do. The first step is to prepare a script that can generate an HTML email message and use the Gmail SMTP server to send it. We are not focused on mass distribution here; for the moment we just want to email the results to ourselves on demand.
The built-in SMTP handling in the Ruby standard library more than meets our needs here. The Gmail SMTP details are also easily obtained. The result is a script with two simple functions: the first for building a standard SMTP message with support for text/html content (mimetype), and the second for connecting to the Gmail SMTP server and posting the email. The script provides a spot test that will ask for your Gmail credentials and use them to send you a hello world email. Easy as pie.
See sendemail.rb
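As a rough sketch of what those two functions could look like (this is not the contents of sendemail.rb; the names build_html_message and send_via_gmail are made up for illustration, and I am assuming Gmail's usual SMTP settings of smtp.gmail.com on port 587 with STARTTLS):

```ruby
require 'time' # adds Time#rfc2822 for the Date header

# Build a minimal SMTP message whose body is HTML.
# (Hypothetical helper, not the function in sendemail.rb.)
def build_html_message(from, to, subject, html_body)
  <<~MESSAGE
    From: #{from}
    To: #{to}
    Subject: #{subject}
    Date: #{Time.now.rfc2822}
    MIME-Version: 1.0
    Content-Type: text/html; charset=UTF-8

    #{html_body}
  MESSAGE
end

# Connect to Gmail's SMTP server and post the message.
# Assumed settings: smtp.gmail.com, port 587, STARTTLS, PLAIN auth.
def send_via_gmail(address, password, message)
  require 'net/smtp' # loaded here so the builder works without it
  smtp = Net::SMTP.new('smtp.gmail.com', 587)
  smtp.enable_starttls
  smtp.start('localhost', address, password, :plain) do |conn|
    conn.send_message(message, address, address) # from and to ourselves
  end
end

# Usage (sends a real email, so left commented out):
# msg = build_html_message('you@gmail.com', 'you@gmail.com',
#                          'Hello', '<h1>Hello World</h1>')
# send_via_gmail('you@gmail.com', 'your-password', msg)
```

The blank line between the headers and the body matters: it is what tells the mail client where the headers end. Depending on your account settings, Gmail may want an app password here rather than your normal login password.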
See below for a screenshot of my Gmail inbox with the resulting test email.
Step 2: Cron Send Email
The next step is to prepare a script that can be executed each day, generate a summary of the AIFeed output and email it to you. The easiest way to do this on any Linux or Mac machine is with cron.
A variation of the listpopulardayarticles.rb script from Part 3 is used as the basis for the email. A new function is defined that generates the HTML content of the email and sends it. The script accepts two parameters on the command line: a Gmail email address and a Gmail password. These credentials are then used to send the email using the script prepared above.
See dailyfeed.rb
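To make the shape of that script concrete, here is a minimal sketch under some assumptions: the Article struct, parse_credentials and build_digest_html are hypothetical names of my own, and the hand-made article list stands in for the popularity data that the real dailyfeed.rb takes from the Part 3 script.

```ruby
require 'erb' # ERB::Util.html_escape for safe titles and URLs

# Hypothetical record for one scored article.
Article = Struct.new(:title, :url, :score)

# Expect exactly two command line arguments: address and password.
def parse_credentials(argv)
  unless argv.length == 2
    raise ArgumentError, 'usage: ruby dailyfeed.rb <gmail address> <gmail password>'
  end
  argv
end

# Render the highest scoring articles as a small HTML list.
def build_digest_html(articles, limit: 10)
  top = articles.sort_by { |a| -a.score }.first(limit)
  items = top.map do |a|
    url = ERB::Util.html_escape(a.url)
    title = ERB::Util.html_escape(a.title)
    %(<li><a href="#{url}">#{title}</a> (#{a.score})</li>)
  end
  "<h1>Popular articles</h1>\n<ul>\n#{items.join("\n")}\n</ul>"
end

# Usage, with made-up data:
# email, password = parse_credentials(ARGV)
# html = build_digest_html([Article.new('Some post', 'http://example.com/', 42)])
# ...then hand html to the send-email function from Step 1.
```

Escaping the titles is worth the extra line, since feed titles regularly contain characters like & and < that would otherwise break the email's markup.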
To execute the script we can create a shell script that contains the call to the script and the Gmail credentials used to send the email. For example, the shell script may be called run_dailyfeed.sh and look as follows:
#!/bin/sh
cd /path/AIFeeds/part4/
ruby dailyfeed.rb [email] [password] >> /path/AIFeeds/part4/dailyfeed.log 2>&1
The script is three lines: a shebang, a change of directory to the script location, and a call to the Ruby script with parameters. Replace /path/ with the path to your AIFeed directory, and replace [email] and [password] with your Gmail login details. The output of the script is appended to the log file dailyfeed.log; errors are captured too when stderr is redirected with 2>&1.
The crontab for the current user can be opened as follows:
crontab -e
Add an entry that looks something like the following:
00 5 * * * /path/AIFeeds/part4/run_dailyfeed.sh
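This is all one line with tabs in between the fields. Again, replace /path/ with the path to your AIFeed directory, and make sure the shell script is executable (chmod +x run_dailyfeed.sh), otherwise cron cannot run it. Cron will execute the shell script once each day at 5am local time.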
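If the five schedule fields are unfamiliar, this small sketch labels each field of the crontab entry (the describe_cron_entry helper is hypothetical and purely for illustration; cron itself does the real parsing):

```ruby
# Names of the five schedule fields in a crontab line, in order.
CRON_FIELDS = %i[minute hour day_of_month month day_of_week].freeze

# Split a crontab line into its schedule fields and the command.
def describe_cron_entry(line)
  # Limit of 6 keeps any spaces inside the command intact.
  *schedule, command = line.strip.split(/\s+/, 6)
  CRON_FIELDS.zip(schedule).to_h.merge(command: command)
end

entry = describe_cron_entry('00 5 * * * /path/AIFeeds/part4/run_dailyfeed.sh')
entry.each { |name, value| puts "#{name}: #{value}" }
```

So minute 00 of hour 5, on every day of the month, every month and every day of the week, which is how we get the daily 5am run.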
The following is an example of a resulting email sent to my email account.
Improvements and Extensions
This section summarizes possible improvements and extensions to this part in the series.
- Cron is an easy way to schedule a task on your machine. A better approach would be to set this up on a server (such as Heroku, AppEngine, or AWS) and send an email to an email list (via something like MailChimp).
- An interesting extension to this project would be to turn the output into a webpage that is re-generated every hour or so. This might provide a useful alternative to Reddit and Hacker News, with a targeted corpus of links to scan over.
In this fourth and final part in the series we have hacked together a simple script to send email via Google Gmail and scheduled the script to email ourselves a list of popular AI articles each morning at 5am. Not bad for a few hours' work. Sure, there are some rough edges, but the result is entirely functional and I think useful.
If you would like to see this as a service or perhaps a website, drop me a comment or an email. I'd be happy to clean it up further and automate it for a broader crowd if I knew that others as passionate as me about AI and Machine Learning were interested!
Don't forget all code and data for this series is available on the AIFeeds GitHub project.