Can I automate Craigslist searches?
I'd like to know if there's some way to automate searching the Craigslist site with a shell script or similar, so that I can keep an eye on it and know when certain rare auction items show up for sale.
This is a very fun question because it flashes back to one of my first startups, a company called iTrack, which was built around a Web-based service that automated searching and tracking auction items available on all the major auction sites (eBay, Amazon, Yahoo, etc.). That was years ago, but the basic idea remains, and it's delightfully simple.
Let's dig into the Craigslist site first, to see how it encodes searches.
Rather than a single site, Craigslist is really broken down into about 50 different regional datasets, all addressed by subdomain. For example, "denver.craigslist.org" and "boulder.craigslist.org" cover the Denver, Colorado and Boulder, Colorado markets, respectively. A search of one regional market, however, doesn't reveal potential matches in any other. Is that a limitation? It doesn't really matter; that's just how the site's designed.
This means that the very first step is to do a search and see what the resultant URL looks like. To search for "Roland Drum" on the Boulder site, for example, here's the URL:
As you can see, the regional domain name shows up here, and the query shows up at the end of the URL.
It's slightly trickier than that, however: if you want to limit the search to just those Craigslist items where the title matches, a bit more experimentation reveals that the URL now has a few more fields:
Turns out that we aren't specifying min or max price so those parameters can be scrubbed out, leaving the resultant URL:
So one way you could do this is to simply bookmark this URL and any time you want to check, just click directly to the search results.
That's not very automated, though, so let's dig into what a simple shell script that does this search might look like. I like to use curl to grab web pages so the base script might be:
curl -s "$url"
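Since each regional site has to be searched separately, one possible way to cover several markets is to take a search URL copied from the browser and swap out the subdomain. The sketch below assumes only that the region appears as the first part of the host name, as described above; swap_region is a hypothetical helper name, not part of my original script.

```shell
#!/bin/sh
# Sketch: rewrite the regional subdomain of a Craigslist search URL.
# swap_region is a hypothetical helper introduced for illustration.
swap_region() {
    # $1 = search URL copied from the browser
    # $2 = region name to substitute, e.g. denver
    printf '%s\n' "$1" | sed "s#//[^./]*\.craigslist\.org#//$2.craigslist.org#"
}

# Usage (left commented out so nothing hits the network by accident):
# url="$1"    # pass the search URL from your browser as the first argument
# for region in denver boulder; do
#     curl -s "$(swap_region "$url" "$region")"
# done
```

The rest of the URL, including the query, is passed through untouched, so whatever search you built in the browser is reused as-is for every region.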
The problem with running curl by itself is that the output is raw HTML which, needless to say, isn't designed to be easily parsed and analyzed. After looking closely at the page source, though, it turns out that there are some patterns that let you pick out the matches and omit everything you don't want. The key pattern is href="/msg/ which you can apply by using it in a "grep". Then watch how I break every HTML tag onto its own line with a "sed" invocation too:
curl -s "$url" | \
grep 'href="/msg/' | \
sed 's/</\
</g'
Once you've got that, you just need to screen out the HTML tags and lines that aren't interesting, which can again be done with "grep", but this time we'll use a regular expression. Ready? Here ya go:
curl -s "$url" | \
grep 'href="/msg/' | \
sed 's/</\
</g' | \
grep -v -E '(</a>|<i>|</font>|href="/msg/"|</i>|</p>|span>|class="p"|<p>|font size)' | \
grep "<a href="
Looks rather complex, I admit, but if you run it with the broader search pattern "drums", here's what you get:
/msg/728672846.html -- Cuban Style Cajon Drums - $500 -
/msg/728630582.html -- African Dun Dun Drums - $450 -
/msg/724851366.html -- --------Drums and Drum Stuff!---------- -
/msg/706689489.html -- DRUM KIT, Yamaha ELECTRIC DRUMS, DTXPLORER, FUN! FUN! - $575 -
/msg/694513136.html -- Drums --Slingerland Drum Set - $350 -
/msg/679106717.html -- Kid Drums Traps kindly used - $50 -
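The pipeline as shown leaves the anchor tag at the front of each line, so one more step is needed to get output in the format above. Here's a minimal sketch of how that might be done, assuming each surviving line looks like `<a href="/msg/NNN.html">Title`. The format_matches name and the optional base URL argument are hypothetical, added for illustration rather than taken from my final script.

```shell
#!/bin/sh
# Sketch: turn '<a href="/msg/NNN.html">Title' lines into 'URL -- Title'.
# format_matches is a hypothetical helper; the input shape is an assumption.
format_matches() {
    # $1 = optional base URL (e.g. http://boulder.craigslist.org),
    #      prepended to each path so the link is complete.
    #      It should not contain '#' or '&', which are special to sed here.
    base=$1
    sed -n 's#^<a href="\(/msg/[^"]*\)">#\1 -- #p' | sed "s#^#$base#"
}

# Usage: append to the end of the search pipeline, for example
#   ... | grep "<a href=" | format_matches "http://boulder.craigslist.org"
```

Called with no argument, it prints just the /msg/ path followed by the title; with a base URL, each line starts with a full address you can click.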
My final script has a few additional refinements, like making the URL clickable, and it wouldn't be too insanely difficult to save the search results each night and run a "diff" against the previous day's output. If there's something new, it could be emailed to you directly. It's what we call an "exercise best left to the reader".
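For anyone who'd like a head start on that exercise, here's a rough sketch of the nightly comparison. It assumes the search output has already been saved to a file each night; report_new, the file names, and the email address are all hypothetical placeholders.

```shell
#!/bin/sh
# Sketch: print only the lines that are new in today's results.
# report_new is a hypothetical helper introduced for illustration.
report_new() {
    # $1 = yesterday's saved results, $2 = today's results
    # diff prefixes lines found only in the second file with "> "
    diff "$1" "$2" | sed -n 's/^> //p'
}

# Usage from a nightly cron job (commented out; file names and
# address are placeholders):
# ./search.sh > today.txt
# new=$(report_new yesterday.txt today.txt)
# if [ -n "$new" ]; then
#     printf '%s\n' "$new" | mail -s "New Craigslist matches" you@example.com
# fi
# mv today.txt yesterday.txt
```

Listings that have dropped off since yesterday show up in the diff with a "<" prefix, and this sketch deliberately ignores them, so you only hear about new items.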
Hope that this gives you a productive path to travel on the way to creating the script you seek!