A follow-up to a previous post…thinking through different approaches to making an automated RSS news portal.
Currently I am making a news portal and am wondering which news aggregator is the best bet as the source of my content…I guess I need to try them all, compare the results, and go with the one I like best as the source for my portal…or I could use all of them.
The idea is that the RSS feeds will syndicate the content directly (to a Blogdigger group) without me having to moderate the quality of the articles…if my search query is precise enough, the portal can run automatically and still deliver relevant, high-quality content.
The 3 news aggregators I am looking at are Google News, Yahoo! News, and Topix…a search query RSS feed will be generated from each of these sites and then fed into a Blogdigger group (acting as the portal).
Before I release the portal I want to see if it is worth having all 3 feeds:
- what is the average quantity of overlap?
- what is the average quantity of unique items?
- how many items per day will there be to read?
- is there a way for duplicate URLs to be removed?
Each news aggregator may remove its own duplicates, but then I’ll still have to remove the duplicates between the 3 of them…apparently Blogdigger groups don’t have this feature.
On this point, I wonder if private RSS readers can remove duplicates…I know Newsgator does something to this effect (does anyone know the process involved?).
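As a rough sketch of what removing duplicates between the three feeds could look like (this is not how Newsgator or any of the aggregators actually does it, just one assumed approach): treat two items as duplicates when their URLs match after some light normalisation. The function names and the normalisation rules (ignoring case in the host, trailing slashes, fragments and `utm_` parameters) are my own assumptions.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def normalise(url):
    """Reduce a URL to a comparable form (assumed rules, adjust to taste)."""
    parts = urlsplit(url.strip())
    # Assumption: parameters starting with "utm_" are tracking tags, not content.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    path = parts.path.rstrip("/")
    # The fragment ("#...") is dropped by passing an empty string.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(sorted(query)), "")
    )


def merge_without_duplicates(feeds):
    """feeds maps a feed name to its list of item URLs.

    Returns the URLs with cross-feed duplicates removed (first one seen wins)
    and a small dict of counts.
    """
    seen = {}      # normalised URL -> names of the feeds that carried it
    ordered = []   # original URLs, one per distinct item
    for name, urls in feeds.items():
        for url in urls:
            key = normalise(url)
            if key not in seen:
                seen[key] = set()
                ordered.append(url)
            seen[key].add(name)
    stats = {
        "total": sum(len(urls) for urls in feeds.values()),
        "unique": len(seen),
        "overlapping": sum(1 for names in seen.values() if len(names) > 1),
    }
    return ordered, stats
```

Run over a day’s worth of item URLs from each feed, the counts would give a first answer to the overlap and unique-item questions above. It only catches items that point at the same URL; the same story published at two different addresses would still get through.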
- What if I’d like to use all of them because each of them returns relevant results that the others miss?
The problem is that with my current search term I’m getting an average of about 70 hits in total, which is too many for my client to scan and read in 15 minutes every day (that works out to roughly 13 seconds per item).
- I wonder how many hits will be left if duplicates are removed?
- If I can’t remove duplicates, there will be too many hits, so I will have to use content from only one of these feeds (Google News or Yahoo! News or Topix).
- Even if I do remove duplicates, I may still have too many hits for my client to scan and read in 15 minutes.
- So I will have to go with one feed, i.e. use only one news aggregator as the source for my portal.
If this is the case, I could just use a customised Google News and not need a portal at all.
Or I could present the news by source, with 3 RSS-to-JAVA boxes on a web page.
- Or I could use all three and have them in a drop-down menu like EEVL’s OneStep Industry News…this requires some expertise to develop.
EEVL’s OneStep Industry News is a great example of a list of news feeds that also shows the latest content and is a searchable database as well…it’s like a customised version of a Blogdigger group.
The feeds are presented at three levels: the top level (all feeds), the 2nd level (all feeds within a subject: Engineering, Mathematics, Computing, and General Technology), and the 3rd level (the individual feeds).
The only problem is that if you want to read this content in your own RSS reader, like Bloglines, you can only subscribe to the 3rd-level feeds; there isn’t a mega-feed (compilation feed) at the 1st or 2nd level. I guess you could make your own mixed feeds if you wanted to, but it would be good if EEVL already had them primed.
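For what it’s worth, making a mixed feed yourself doesn’t take much. Here is a minimal sketch, assuming the source feeds are plain RSS 2.0 and have already been downloaded as text. `merge_feeds` and its parameters are hypothetical names of my own, and none of this is something EEVL provides.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def merge_feeds(feed_xml_strings, title="Combined feed",
                link="http://example.com/", description="Merged items"):
    """Merge the <item> elements of several RSS 2.0 documents, newest first."""
    items = []
    for xml_text in feed_xml_strings:
        root = ET.fromstring(xml_text)          # root is the <rss> element
        items.extend(root.findall("./channel/item"))

    def published(item):
        # RSS 2.0 dates are RFC 822 style; items without a usable date sort last.
        try:
            return parsedate_to_datetime(item.findtext("pubDate")).timestamp()
        except (TypeError, ValueError):
            return 0.0

    items.sort(key=published, reverse=True)

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link                # placeholder value
    ET.SubElement(channel, "description").text = description  # placeholder value
    for item in items:
        channel.append(item)
    return ET.tostring(rss, encoding="unicode")
```

The returned string could be saved to a file on a web server and subscribed to like any other feed. Atom feeds and namespaced extensions are not handled here, so this is only a starting point.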