RSS duplication relief
For my job I’ve been tracking news using the different News Engines…lucky for me they all supply search RSS feeds.
News Engines I use:
…others MSN News, All Headlines News
…I also follow some of the RSS engines as well.
In a previous post I mentioned the difficulty of choosing which source to use, or whether to just use them all (and of course Factiva is another ball game).
A problem for me was duplication, both within a feed and amongst all the feeds
…I did searches such as: Rio Tinto Australia, but also did other searches such as: Rio Tinto (and selecting Australia in the country news source field in the advanced section)…just to be exhaustive.
So for every news service I had 2 searches; multiply that by the number of news services (Google News, Yahoo! News, and Topix), and that’s 6 feeds for just one search topic.
If I had 5 search topics, that’s 30 feeds to look through every morning…and again the annoying thing was the duplication in each feed, and across all feeds.
Well, the first great help was when Google News added RSS feeds (I was always using watered-down hacks before). Now every post in my RSS reader includes the collapsed related stories that Google does so well.
…and on top of that you can also get a feed for the whole of your Google Customised News Page, which is kind of a spliced feed of all your customised sections (so that cut 6 feeds down to 1).
Now I have a simple idea I haven’t tried yet that will hopefully reduce the duplication, and with it the overload, sparing me some time: whack all my feeds into a re-mixer like Feed Digest and turn on the duplicate filter
(filter items with duplicate titles, or filter items with duplicate URLs).
This way I will only have to see the same story once, and I’m getting all my content from just one feed
…as for related stories (which aren’t technically duplication), at least from Google News they are being collapsed.
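To make the idea concrete, here is a rough Python sketch of what such a duplicate filter might do once the feeds are merged. It is not how Feed Digest actually works internally (I don't know that); the function names, the `title`/`link` item fields, and the normalisation rules are all my own assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_title(title):
    # Assumption: titles match if they differ only in case or spacing.
    return " ".join(title.split()).casefold()


def normalise_url(url):
    # Assumption: URLs match if they differ only in host/scheme case,
    # a trailing slash, or a #fragment.
    parts = urlsplit(url.strip())
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.rstrip("/"),
        parts.query,
        "",  # drop the fragment
    ))


def merge_without_duplicates(feeds):
    """Merge several feeds into one list, keeping the first copy of each story.

    `feeds` is a list of feeds; each feed is a list of dicts with
    hypothetical "title" and "link" keys.
    """
    seen_titles = set()
    seen_urls = set()
    merged = []
    for feed in feeds:
        for item in feed:
            title_key = normalise_title(item["title"])
            url_key = normalise_url(item["link"])
            # Skip the item if either its title or its URL was already seen.
            if title_key in seen_titles or url_key in seen_urls:
                continue
            seen_titles.add(title_key)
            seen_urls.add(url_key)
            merged.append(item)
    return merged
```

Feeds listed first win, so putting the most trusted source at the front of the list decides which copy of a story survives. Matching on exact titles will still miss the same story published under a reworded headline.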
This is also handy for ego searching (checking your incoming links at all the RSS engines)
…instead of subscribing to feeds for incoming links from several RSS engines, just collapse them into one and turn on the duplicate filter…time-saving and efficient…Talkdigger kind of does this.













