I keep thinking that Google would hide this, but is there any way to figure out the last time that the Google search engine spider (Googlebot) visited a specific Web page? That’d be particularly useful for competitive research…
It turns out that there’s a bunch of interesting information you can glean from Google search results, information that’s there, but not necessarily obvious. Indeed, search engines often make extra information available, but we gloss past it in our zeal to find the answer to our specific search questions.
Critical to remember is that Google keeps its own copy of the Web pages it indexes in its own local store, called its “cache”, so examining that information will tell us what we want to know. But let me show you, rather than just talk about it.
For example, let’s do a quick Google search on iphone help and see what comes up…
Hark! It’s my own site (yeah, that was a plant). 🙂
It’s the last line of the result I want you to focus upon. It shows the URL of the page, the approximate page size, a link to the cached copy of the page (which is critically important for our detective effort!) and a way to search for similar pages in the Google engine. Finally, if you’re logged in to your Google account, the “Note this” link is a simple way to add the result to your Google Notebook.
Click on the Cached link and you’ll see an archived copy of the Web page from Google’s internal storage system, with a long, complicated header that’s oh-so-important:
Notice the first line: “This is G o o g l e’s cache of the URL as retrieved on Apr 1, 2008 07:00:18 GMT.”
So there’s your answer. This particular page was last crawled by Google on 1 April, 2008. Since I’m writing this on 4 April, this means that Google’s snapshot of this page is three days old.
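If you want to do this for more than a page or two, the date arithmetic is easy to script. Here’s a minimal Python sketch, assuming you’ve already copied the header line out of the cached page by hand. The function names are my own, made up for illustration, and the pattern assumes the header keeps exactly the wording shown above, so treat it as a starting point rather than anything Google provides.

```python
import re
from datetime import date, datetime, timezone

# The first line of the cache header, as quoted in this article.
HEADER = ("This is G o o g l e’s cache of the URL "
          "as retrieved on Apr 1, 2008 07:00:18 GMT.")


def cache_timestamp(header: str) -> datetime:
    """Pull the 'retrieved on ...' timestamp out of a cache header line.

    Assumes the wording shown in the article and English month
    abbreviations (%b depends on the current locale).
    """
    match = re.search(
        r"retrieved on (\w{3} \d{1,2}, \d{4} \d{2}:\d{2}:\d{2}) GMT", header
    )
    if match is None:
        raise ValueError("no cache timestamp found in header")
    parsed = datetime.strptime(match.group(1), "%b %d, %Y %H:%M:%S")
    # The header states GMT, so mark the result as UTC.
    return parsed.replace(tzinfo=timezone.utc)


def snapshot_age_days(header: str, today: date) -> int:
    """Whole calendar days between the crawl date and 'today'."""
    return (today - cache_timestamp(header).date()).days


if __name__ == "__main__":
    # 4 April 2008 is the day this article was written.
    print(snapshot_age_days(HEADER, date(2008, 4, 4)))
```

Run with the header above and 4 April 2008 as the current date, this reports the same three-day gap worked out by hand.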
After viewing the cached snapshot, how do you get rid of it?