I notice once in a while that my RSS aggregator is showing me feed items that are sometimes two months old. Why would the aggregator be serving up such stale news?
Y’know, I’ve noticed that this happens too with some frequency on my RSS reader, NewsGator, and I decided to dig into it a bit to find out what’s going on.
It turns out that the algorithm that RSS readers or aggregators use to figure out what snippets of news you haven’t seen is pretty simplistic; it’s all based on whether an item’s last modification date is more recent than the last time you read the news from that site. Ordinarily, that’d work perfectly because, just as with a newspaper, once an RSS feed from a Web site or weblog adds a new item, it never changes and you only see it once.
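To make that concrete, here’s a minimal sketch of the comparison in Python. It isn’t taken from NewsGator or any other real reader; the function name, the item fields and the dates are all made up for illustration.

```python
from datetime import datetime, timezone

def unseen_items(items, last_read):
    """Return the items whose last-modified time is later than last_read.

    Hypothetical sketch of the simple rule described above, not the code
    of any particular RSS reader.
    """
    return [item for item in items if item["modified"] > last_read]

# Illustrative data: the last time you read this feed, and two items.
last_read = datetime(2005, 3, 1, tzinfo=timezone.utc)
items = [
    {"title": "Old post", "modified": datetime(2005, 1, 10, tzinfo=timezone.utc)},
    {"title": "New post", "modified": datetime(2005, 3, 5, tzinfo=timezone.utc)},
]

for item in unseen_items(items, last_read):
    print(item["title"])  # prints "New post"
```

Notice that the check never looks at what the item actually says, only at when it was last modified.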
But what do you think would happen if the site administrator is monkeying with the backend software that generates the feed and changes a seemingly irrelevant HTML tag within the RSS feed from, say, <br> to <br />? If you’re thinking that the last modified time changes, then you’re exactly right.
In fact, that’s mostly what happens when older RSS articles are presented as new content again, as far as I can ascertain.
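Here’s a small Python sketch of that scenario, again under made-up assumptions: the item fields, the dates and the is_new helper are hypothetical, and a real reader will be more involved. The point is just that a regenerated feed can carry the same words with a fresh timestamp.

```python
from datetime import datetime, timezone

def is_new(item, last_read):
    # Naive rule: anything modified after the last read counts as unseen.
    return item["modified"] > last_read

last_read = datetime(2005, 3, 1, tzinfo=timezone.utc)

original = {
    "title": "Two-month-old post",
    "body": "First line<br>Second line",
    "modified": datetime(2005, 1, 10, tzinfo=timezone.utc),
}

# The backend is updated and regenerates the feed: the wording is the same,
# but the markup and the last-modified time both change.
regenerated = {
    **original,
    "body": original["body"].replace("<br>", "<br />"),
    "modified": datetime(2005, 3, 2, tzinfo=timezone.utc),
}

print(is_new(original, last_read))     # False
print(is_new(regenerated, last_read))  # True
```

The second result is the stale item popping back up as if it were fresh news.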
Of course, some RSS feed authors will actually go back and modify older articles too, perhaps adding a forward link to a newer article on the subject, fixing a broken URL, or engaging in a little bit of historical revisionism. In all those cases, the entry in the RSS reader will appear again, even if it’s weeks or even many months old.
Finally, when a site with an RSS feed, particularly a blog site with its feed enabled, moves to a new server or a new version of its backend software, every file on the server is ‘touched’ and will then have a newer last modified date. Therefore, every entry in the feed suddenly appears in your RSS reader for no apparent reason.
For now, that’s just the way the system works. Since some sophisticated RSS readers (notably NetNewsWire) actually archive RSS data so they can show you change marks and let you see how a specific entry in an RSS feed evolves over time, it’s theoretically possible for these same readers to suppress the presentation of redundant RSS entries entirely.
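As a rough illustration of how that suppression might work, here’s a Python sketch that remembers a fingerprint of each entry’s visible text and only shows an entry again when the text really changes. This is my own guess at an approach, not how NetNewsWire or any other reader does it; the SeenArchive class, the fingerprint helper and the entry IDs are hypothetical, and the tag stripping is deliberately crude.

```python
import hashlib
import re

def fingerprint(body):
    """Hash the visible text of an entry, ignoring markup, case and spacing."""
    text = re.sub(r"<[^>]+>", " ", body)   # crude tag stripping, fine for a sketch
    text = " ".join(text.split()).lower()  # collapse whitespace, normalize case
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

class SeenArchive:
    """Remembers the last version of each entry that was shown to the reader."""

    def __init__(self):
        self._seen = {}  # entry id -> fingerprint of the last version shown

    def should_show(self, entry_id, body):
        fp = fingerprint(body)
        if self._seen.get(entry_id) == fp:
            return False  # same words as last time: suppress it
        self._seen[entry_id] = fp
        return True       # brand new, or the text really changed

archive = SeenArchive()
first = archive.should_show("post-1", "First line<br>Second line")
again = archive.should_show("post-1", "First line<br />Second line")
edited = archive.should_show("post-1", "First line<br />Second line, now with a fixed link")
print(first, again, edited)  # True False True
```

With something like this, the <br> to <br /> change stays quiet, while a genuine edit to an old article still shows up, which is probably what you’d want.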
Hope that helps clarify things!
If you’d like more information about Really Simple Syndication and how to work with it, I also have a variety of articles on the topic, which you can find by searching for RSS on this site.