
Can I track an RSS feed with a shell script?
More than once, readers have written to me, asking if it was possible to track an RSS feed from a Weblog or news site with a shell script. Sounds kinda wacky, but in fact, it's a very good use of a shell script, as the following rather extensive entry -- including source code! -- demonstrates. If you're a bit confused by the following, you might want to consider picking up a copy of my best-selling Wicked Cool Shell Scripts.
The following article originally appeared at MacDevCenter and is reprinted with permission.
Tapping RSS with Shell Scripts
If you're like me, you want to keep up with the latest news and information. Shell scripts help me do just that. In this article I'll show you how I wrote a shell script that watches the news at Slashdot.org and automatically shows me the latest story headlines every time I launch a Terminal application.
First Things First
Before any shell script work begins, the first step is to figure out the URL of the RSS page on Slashdot.
The Slashdot home page doesn't make it particularly easy to find, but the very bottom line, the very rightmost link, is "rss", and the URL behind that link is http://slashdot.org/index.rss. To look at it from within the Terminal, I'm going to utilize the powerful curl application, piping the output to head to ensure that I'm not drowned in output:
Yes, this looks fairly scary as output goes, I admit, but with a little help from the grep utility, this can quickly become a lot more user-friendly. In this case, let's just pull out the lines that are tagged as either the <title> or the <description>:
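As a minimal sketch of that filter, the block below runs grep over a small made-up feed. The Example News content and the function names sample_feed and title_and_description are assumptions for illustration, not Slashdot's actual data:

```shell
#!/bin/sh
# Hypothetical sample feed standing in for the live curl output.
sample_feed() {
  cat <<'EOF'
<rss version="0.91">
<channel>
<title>Example News</title>
<link>http://example.com/</link>
<description>Sample feed for illustration</description>
<image>
<title>Example News</title>
<url>http://example.com/logo.gif</url>
</image>
<item>
<title>First story</title>
<link>http://example.com/1</link>
<description>Summary of the first story.</description>
</item>
<item>
<title>Second story</title>
<link>http://example.com/2</link>
<description>Summary of the second story.</description>
</item>
</channel>
</rss>
EOF
}

# Keep only the lines that carry a <title> or <description> tag.
title_and_description() {
  grep -E '<(title|description)>'
}

sample_feed | title_and_description
```

Against the live feed, the same filter would follow curl in the pipe, something like curl --silent http://slashdot.org/index.rss | title_and_description.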
Not bad. In fact, that's really almost all we need. So let's turn this into a shell script.
Headlines Only
To turn this command line into a shell script is a breeze: just open up your favorite Terminal command-line editor. (I use vi, but I've been trapped in Unix since 1980, so it's already subverted my neural pathways. You might prefer pico or even BBEdit or similar.) Whichever you choose, type in the following, a standard shell script preamble:
This tells the operating system that when this particular file is executed, it should be given to the shell (sh) to be run. Then let's create a variable that contains the URL:
Now we can reference
This script produces the output already seen, so let's make two tweaks
to it so it's more useful. First off, the first three lines of output,
the Slashdot title and description, never change so it'd be just as
easy to strip them out of the output. This can be done a variety of
ways, but I'm going to turn to the
Notice the trailing backslash here: rather than have our command pipe stretch longer and longer, the backslash (which must be the very last character on the line) lets me wrap the command across multiple lines and make it generally more readable.
We're getting close to trying the script. The only other tweak worth making is to strip out the
The XML tags are effectively stripped out, except the
This shows the top two stories (4 lines = two titles + two descriptions). Not bad. Not beautiful, but certainly functional for a first script. I always spend way too much time fine-tuning scripts to get just the output I want, so let's continue working on this to ensure that the output is more readable, shall we? It's so easy, you'll be amazed:
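Putting the steps so far together, here's a rough sketch of the pipeline: keep the title and description lines, skip the three fixed header lines, then strip the tags. The sed expressions, the four-space indent on descriptions, and the function name headlines are assumptions for illustration rather than the exact listing:

```shell
#!/bin/sh
# Sketch: reads raw RSS on stdin and prints, for each story, one headline
# line followed by one indented description line.
headlines() {
  grep -E '<(title|description)>' | \
    sed -n '4,$p' | \
    sed -e 's/^[[:space:]]*//' \
        -e 's/<title>//' -e 's/<\/title>//' \
        -e 's/<description>/    /' -e 's/<\/description>//'
}

# Hypothetical input: three fixed header lines, then two stories.
headlines <<'EOF'
<title>Example News</title>
<description>Sample feed for illustration</description>
<title>Example News</title>
<title>First story</title>
<description>Summary of the first story.</description>
<title>Second story</title>
<description>Summary of the second story.</description>
EOF
```

The sed -n '4,$p' step is what drops the first three lines, and each trailing backslash lets the pipe continue on the next line.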
The results, piped through head again:
The problem now is that the
Headlines, As Many As You Want
The obvious solution is to add a command flag that lets you specify how many headlines you want: multiply it by two and you'll know what value to feed
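The arithmetic is simple enough to sketch: each story takes two lines (title plus description), so the count gets doubled before it reaches head. The function names lines_for and top_stories, and the default of two stories when no count is given, are assumptions for illustration:

```shell
#!/bin/sh
# Convert a requested story count into a line count for head:
# each story occupies two lines (title + description).
# Falls back to 2 stories when no count is supplied (an assumed default).
lines_for() {
  echo $(( ${1:-2} * 2 ))
}

# Print only the first N stories from already-formatted headline output.
top_stories() {
  head -n "$(lines_for "$1")"
}

# Hypothetical formatted output: three stories, two lines each.
printf 'title 1\n    desc 1\ntitle 2\n    desc 2\ntitle 3\n    desc 3\n' | top_stories 1
```

Run with a count of 1, the sketch keeps just the first title and its description.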
Now I can specify that I only want the top headline, the newest entry
on the Slashdot site, by simply specifying '
That's pretty cool, I think. I could tweak it forever, but let's stop
here and see how to turn this into a Unix command just like
Turning It Into a Command
There are two ways to turn a shell script into a command: create an alias, or make the script executable and ensure it's in your PATH. If you're using Bash, an alias can be created like this:
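For instance, assuming the script was saved as slashdot.sh in a bin directory under your home directory (both that file name and the alias name headlines are hypothetical), the alias might look like this:

```shell
# Hypothetical alias name and script path; adjust both to match your setup.
# Single quotes delay the $HOME expansion until the alias is used.
alias headlines='sh "$HOME/bin/slashdot.sh"'
```

To keep the alias across sessions, the line would need to live in one of your Bash startup files.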
Then you can see the headlines by just typing
To make the shell script itself executable, first make sure you've saved it in a directory that's in your PATH by typing:
You can see that my PATH includes /Users/dt/bin - that's where I save this script and similar. Once it's in the right place, you'll need to make it executable by using the chmod command:
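As a self-contained sketch of those two steps, the block below uses a throwaway directory in place of ~/bin and a placeholder script named slashdot.sh (both are assumptions for illustration), so it's safe to run as-is:

```shell
#!/bin/sh
# Throwaway directory standing in for ~/bin (hypothetical, for a safe demo).
bindir=$(mktemp -d)
script="$bindir/slashdot.sh"

# Placeholder script body; the real one would hold the curl pipeline.
printf '#!/bin/sh\necho "headlines go here"\n' > "$script"

chmod +x "$script"        # mark the file as executable
PATH="$bindir:$PATH"      # make sure its directory is on the search path

echo "$PATH" | tr ':' '\n' | head -n 1   # confirm the directory is listed
slashdot.sh                              # the script now runs by name
```

With a real ~/bin already on your PATH, only the chmod +x line is needed.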
Optionally, you could rename the script to be a bit more friendly, of course.
Finally, Having It Auto-Execute Upon Terminal Launch
If you're running the Bash shell, which you probably are if you're in Panther, then it's a breeze: move to your home directory and append an invocation of the script to your .bash_login file:
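Here's a cautious sketch of that append, using a temporary file in place of the real .bash_login so nothing in your home directory is touched. The script name slashdot.sh and the pre-existing line are assumptions for illustration:

```shell
#!/bin/sh
# A temporary file stands in for ~/.bash_login so the sketch is safe to run.
login_file=$(mktemp)
echo 'export EDITOR=vi' > "$login_file"   # pretend existing content

# >> appends to the file; a single > would replace its existing contents.
echo 'slashdot.sh' >> "$login_file"

cat "$login_file"
```

The existing line survives and the script invocation lands at the end of the file.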
Make extra sure that you use two
Now the next time you start up a Terminal application window, you'll see:
It's also worth noting that this use of shell scripts to parse and format XML has more applications. For example, go to http://www.casino-bookstore.com/ and have a close look at the "Latest Gambling News" box: it's using an almost identical script to keep track of the gambling news XML feed from about.com. Another example? Go to http://www.healthy-bookstore.com/ and look at the medicinenet news feed. Again, it's using curl and sed to turn the XML data into HTML data.
Hi, Dave this is nice to have a script that monitors rss feeds, but i think it is even nicer If you can output it on your desktop, for example with conky or torsmo. You can also use your gmail account atom feed to monitor your inbox, just wget'in and sed'in. I'd like to ask You how to make my gmail password maximally secure, when it is stored in a shell script. Thank You in advance jot
Posted by: jot at May 20, 2006 12:00 PM
In a shell script I am running a command. If the command is failed after some time I want to run this command until unless the command is successed. Can any one give some suggestion how to do it?
Posted by: Ajit Kumar Sahoo at February 26, 2007 11:08 PM
In a shell script I am running a command. If the command is failed after some time I want to run this command until unless the command is successed. Can any one give some suggestion how to do it? Ans:: for example
Your general approach, checking $?, is correct. What's wrong with your script?
Posted by: Dave Taylor at April 19, 2007 11:59 AM
Hello, Is it possible to add more sites with rss feed than one in the same script?
Posted by: Hiho at July 22, 2007 9:45 AM