Searchfeedr : searching the limits and OPML search
A while back I mentioned wanting to search the full-text of multiple webpages by having them represented by one URL.
This URL could be an OPML URL. Say this OPML URL contains 10 nodes, and each node is the home page of a blog. All I would have to do is enter this OPML URL into one search box, and enter a search query in the other search box…I call this meta-searching via an OPML URL.
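To make the idea concrete, here is a minimal sketch of meta-searching via an OPML. It is only an illustration, not how any of the services mentioned in this post work: the sample OPML, the example.* addresses and the `fetch` stand-in are all made up, and a real version would download each page instead of reading from a dictionary.

```python
import xml.etree.ElementTree as ET

# Made-up OPML with two blog nodes (the example.* addresses are placeholders).
SAMPLE_OPML = """<opml version="1.1">
  <head><title>Example blogroll</title></head>
  <body>
    <outline text="Blog A" htmlUrl="http://a.example/" xmlUrl="http://a.example/feed"/>
    <outline text="Blog B" htmlUrl="http://b.example/" xmlUrl="http://b.example/feed"/>
  </body>
</opml>"""


def opml_home_pages(opml_text):
    """Return the htmlUrl of every outline node that has one, in document order."""
    root = ET.fromstring(opml_text)
    return [node.get("htmlUrl") for node in root.iter("outline") if node.get("htmlUrl")]


def meta_search(opml_text, query, fetch):
    """Return the pages listed in the OPML whose text contains the query.

    `fetch` is any callable that maps a URL to the page text; it is passed in
    so the sketch can run without touching the network.
    """
    needle = query.lower()
    return [url for url in opml_home_pages(opml_text) if needle in fetch(url).lower()]


# Stand-in for real HTTP fetching (assumption: page text is already available).
FAKE_PAGES = {
    "http://a.example/": "Notes about gardening",
    "http://b.example/": "Searching inside an OPML reading list",
}

hits = meta_search(SAMPLE_OPML, "opml", FAKE_PAGES.get)
print(hits)
```

The two inputs of the function match the two search boxes: one takes the OPML, the other takes the query.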
In my Search rollers post I mention some tinkerings from Tony Hirst that are very close to what I’m after.
1. Search the full-text of links on a given page (HTML URL), see here.
2. Search full-text of del.icio.us links via one of your del.icio.us tag URLs, see here.
3. Search the full-text of links listed on a given page (feed URL), see here.
4. Search the full-text of webpages that link to another webpage, see here…also a related search.
(Technorati shows you all the posts that link to your blog, but you can’t then search within this set of posts)
Now these are all rolled into one search interface, see Searchfeedr…also check out the tools.
Here are some examples:
Search pages linked to from a particular page
(ie. search the full-text of the links on a webpage)
Search over sites that link into a particular page
(ie. search the full-text of sites that are talking about (linking to) this site)
Search over links pulled in from a Technorati Tag via RSS
(search within the feed content of each site in the search results of a Technorati tag…this is handy as Technorati doesn’t let you search within a tag)
Search over a del.icio.us tag
(ie. search the full-text in the domain of the last 15 bookmarks in one of my del.icio.us tags, called “OPML”)
I can do this to search full-text in just the pages (not domains), see here.
Not sure if it does this, but Tony once helped me to find one of my blog posts that linked to someone else’s blog post. I knew the blog post I linked to, but I couldn’t find my post that was linking to it.
A search failed because, as it sometimes turns out, when you link to a blog post the label for your hyperlink may be rather generic, like “see here” or “more here”. This is very unhelpful later on.
Tony suggested using Alta Vista, see here:
link:http://www.techcrunch.com/2006/01/10/searchfox-to-shut-down url:libraryclips.blogsome.com
But this still didn’t work too well, because it didn’t return the post(s); it returned a month URL from my blog. I’d have to scan each post, hovering over all the links…at least it gets you in the ball park.
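As a rough sketch of both steps: the first function just glues together the `link:` and `url:` operators from the query above, and the second part scans the HTML of a page (say, that month archive) for links pointing at the target, so you know which label to look for instead of hovering over every link. The `LinkFinder` class and the sample HTML are my own illustration, not something Alta Vista or Searchfeedr provides.

```python
from html.parser import HTMLParser


def backlink_query(target_url, own_site):
    """Build the query string shown above: pages on own_site that link to target_url."""
    return f"link:{target_url} url:{own_site}"


class LinkFinder(HTMLParser):
    """Collect the anchor text of every link whose href starts with `target`."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.matches = []      # anchor texts of the matching links
        self._in_match = False
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.startswith(self.target):
                self._in_match = True
                self._text = []

    def handle_data(self, data):
        if self._in_match:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_match:
            self.matches.append("".join(self._text).strip())
            self._in_match = False


TARGET = "http://www.techcrunch.com/2006/01/10/searchfox-to-shut-down"

# Made-up fragment standing in for the month archive page.
ARCHIVE_HTML = (
    '<p>Searchfox is closing, '
    '<a href="http://www.techcrunch.com/2006/01/10/searchfox-to-shut-down/">see here</a>. '
    'Unrelated: <a href="http://other.example/">more here</a>.</p>'
)

query = backlink_query(TARGET, "libraryclips.blogsome.com")
finder = LinkFinder(TARGET)
finder.feed(ARCHIVE_HTML)
print(query)
print(finder.matches)
```

With the label in hand, an ordinary in-page find on the archive page takes you straight to the post.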
Can Searchfeedr do this?
Tony looked at my original question in this post of searching via an OPML URL, but it seems a lot of work.
I’ve mentioned this before, but in the old days (last year), Feedster would let you search across an OPML URL…it would search in the feed and not the HTML, but at least it was something.
For now
My idea is that whenever you come across an OPML, or have your own OPML, you can quickly whack it into a search engine and search it. For now, here are the manual alternatives.
1. Create a new account at an RSS reader, eg. Bloglines, import the OPML, and search it. If you make it public, others can search it at the advanced search page…this is a long process if you just want to search this OPML once in your life; the same goes for the solutions below.
2. Similar to point 1, register at a public RSS reader, import the OPML, and you’re done; now you can search across this OPML, eg. Blogdigger Groups, MySyndicaat, Feed Collectors, Technorati Favourites, etc…
3. Save the OPML URL you came across to your PC, then upload it to Google CSE…this is creating your own search engine in order to search all the sites in this OPML. From time to time you would have to reload this OPML to make sure it is the latest version.
4. CleverClogs Grazr mashup allows you to search in an OPML
As you can see, these methods require way too much effort if your intention is a quick, once-only search.
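One throwaway workaround, which is my own suggestion and not one of the methods listed above, is to turn the OPML into a single site-restricted query and paste it into a general search engine. This is only a sketch: it assumes the engine supports the `site:` operator and `OR`, the function name is made up, and real engines limit query length, so it only suits small OPMLs.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Made-up OPML (the example.* addresses are placeholders).
SAMPLE_OPML = """<opml version="1.1">
  <body>
    <outline text="Blog A" htmlUrl="http://a.example/" xmlUrl="http://a.example/feed"/>
    <outline text="Blog B" htmlUrl="http://b.example/blog/" xmlUrl="http://b.example/feed"/>
  </body>
</opml>"""


def site_restricted_query(opml_text, keywords):
    """Combine the keywords with one site: term per distinct host in the OPML."""
    hosts = []
    for node in ET.fromstring(opml_text).iter("outline"):
        url = node.get("htmlUrl") or node.get("xmlUrl")
        if not url:
            continue
        host = urlsplit(url).hostname
        if host and host not in hosts:   # keep document order, drop duplicates
            hosts.append(host)
    sites = " OR ".join(f"site:{host}" for host in hosts)
    return f"{keywords} ({sites})" if sites else keywords


query = site_restricted_query(SAMPLE_OPML, "opml search")
print(query)
```

Note this searches whole domains rather than just the listed pages, and it is a one-off snapshot: if the OPML changes, the query has to be rebuilt.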
Another thing to consider: say this OPML has other OPMLs within it (these are called OPML includes).
eg. Imagine if the Yahoo! Directory was wrapped in OPML, and each topic was its own OPML, so you would have lots of OPMLs within the one mother OPML.
The idea is to search the mother OPML, or just choose one of the topic OPMLs to search, or maybe select a few topic OPMLs to search across.
You can already search in the Yahoo! Directory, but say you come across any old OPML, and because this OPML has lots of OPML includes it is considered an OPML Directory (lots of the nodes in this OPML outline are other OPMLs). I want to be able to enter any of these OPML nodes (includes) or the entire OPML into a search engine and search it.
At the moment anyone can make their own Yahoo! Directory using OPML outlines, each topic being an OPML include, but now we need to be able to plug this in to search across it (even limited to a topic).
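Here is a minimal sketch of how a search tool might flatten such a directory before searching it. It assumes includes are marked the way the OPML 2.0 spec describes them (an outline with `type="include"` and a `url` attribute); the directory, the file names and the `load` stand-in are all made up, and a real tool would download each included file.

```python
import xml.etree.ElementTree as ET


def collect_feeds(opml_text, load, _seen=None):
    """Return every feed URL in the OPML, following includes recursively.

    `load` maps an include URL to that file's OPML text. `_seen` remembers
    includes already followed, so two OPMLs that include each other
    do not cause an endless loop.
    """
    seen = set() if _seen is None else _seen
    feeds = []
    for node in ET.fromstring(opml_text).iter("outline"):
        kind = (node.get("type") or "").lower()
        url = node.get("url")
        if kind == "include" and url:
            if url not in seen:
                seen.add(url)
                feeds.extend(collect_feeds(load(url), load, seen))
        elif node.get("xmlUrl"):
            feeds.append(node.get("xmlUrl"))
    return feeds


# Made-up directory: a mother OPML with two topic OPMLs.
TECH = "http://dir.example/tech.opml"
NEWS = "http://dir.example/news.opml"
FILES = {
    TECH: '<opml version="2.0"><body>'
          '<outline text="T1" type="rss" xmlUrl="http://t1.example/feed"/>'
          '<outline text="T2" type="rss" xmlUrl="http://t2.example/feed"/>'
          '</body></opml>',
    NEWS: '<opml version="2.0"><body>'
          '<outline text="N1" type="rss" xmlUrl="http://n1.example/feed"/>'
          f'<outline text="Tech again" type="include" url="{TECH}"/>'
          '</body></opml>',
}
MOTHER = (
    '<opml version="2.0"><body>'
    f'<outline text="Tech" type="include" url="{TECH}"/>'
    f'<outline text="News" type="include" url="{NEWS}"/>'
    '</body></opml>'
)

all_feeds = collect_feeds(MOTHER, FILES.__getitem__)       # search the mother OPML
tech_only = collect_feeds(FILES[TECH], FILES.__getitem__)  # or just one topic
print(all_feeds)
print(tech_only)
```

Searching a few selected topics would just mean calling the function once per chosen topic OPML and joining the lists.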
Related:
OPML on the fly
Search in an OPML
Search your blog
My OPML wishlist

Hey, sorry for the comment on an old post. Tony Hirst pointed me in your direction because I was asking him similar questions. Have you found anything new that would allow you to constrain a search across the sites contained in an OPML (and have that OPML URL subscribed, not imported a la Google CSE)? It appears that you can subscribe Google Coop in the form of “Linked Custom search Engine” (cf. http://www.google.com/coop/docs/cse/cref.html) but this isn’t OPML but something specific to the Coop engine. Anyways, this remains a point of interest for me but I have yet to find the perfect solution. Cheers, Scott Leslie
Comment by Scott Leslie — October 22, 2007 @ 9:20 pm
Hey Scott,
What a question…exactly what I’m still asking…
I’ve been mentioning this since OPML’s inception: can a public RSS Reader or search engine subscribe to my OPML? Not bulk load it, but subscribe to it.
Sorry, but I can’t recall anything new, but search my blog, as I have posted lots and may have forgotten.
Marjolein from the CleverClogs blog could perhaps help us, I’ll send her a tweet.
Marjolein makes cool Grazr mashups where it will subscribe to an OPML, and she also puts in a search box. But I’m not sure if the search mashup is searching in a spliced feed or the OPML.
http://libraryclips.blogsome.com/2007/10/03/grazr-hack-to-the-max/
At the moment the only service I can think of is the BlogBridge RSS Reader. It can subscribe to an OPML, but I’m not sure if you can search in it, plus it’s a personal/private RSS Reader.
What else, Lijit is more of a lifestream social filter search service, no go on that one, even though you can subscribe to another Lijit user…same goes with Spokeo
http://libraryclips.blogsome.com/2007/10/11/searching-your-social-filter/
Comment by Johnt — October 22, 2007 @ 11:55 pm
Scott,
When you subscribe to an OPML file using Lijit it does check it for changes regularly. Take for instance, the Gnomedex 2007 attendees list:
http://www.lijit.com/users/gnomedex07
As more people signed up for the event, their blogs were added to the OPML and then picked up by the Lijit search engine. Additionally, if you have a del.icio.us bookmark feed, as you add bookmarks those sites would be added.
If you have any questions, you can email me andy at lijit
-A-
Comment by Andy Stanberry — October 23, 2007 @ 2:49 am
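The difference between importing an OPML once and subscribing to it, as discussed in the comments above, can be sketched like this. It is only an illustration of the general idea and not how Lijit does it: the function names and sample OPMLs are made up, and a real service would re-download the OPML on a schedule.

```python
import xml.etree.ElementTree as ET


def feed_urls(opml_text):
    """Return the set of feed URLs (xmlUrl attributes) listed in an OPML."""
    root = ET.fromstring(opml_text)
    return {node.get("xmlUrl") for node in root.iter("outline") if node.get("xmlUrl")}


def opml_changes(previous_opml, current_opml):
    """Compare two versions of the same OPML: (feeds added, feeds removed)."""
    old, new = feed_urls(previous_opml), feed_urls(current_opml)
    return sorted(new - old), sorted(old - new)


# Made-up snapshots of one OPML URL, read at two different times.
LAST_WEEK = (
    '<opml version="1.1"><body>'
    '<outline text="A" xmlUrl="http://a.example/feed"/>'
    '<outline text="B" xmlUrl="http://b.example/feed"/>'
    '</body></opml>'
)
TODAY = (
    '<opml version="1.1"><body>'
    '<outline text="B" xmlUrl="http://b.example/feed"/>'
    '<outline text="C" xmlUrl="http://c.example/feed"/>'
    '</body></opml>'
)

added, removed = opml_changes(LAST_WEEK, TODAY)
print(added, removed)
```

A one-off import stops at the first snapshot; a subscription repeats the comparison and updates the search index with whatever was added or removed.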
thanks for inviting me to comment.
An OPML is only a list of feed names and feed URLs, so technically searching against an OPML wouldn’t make sense.
What you could do is to filter before splicing. It’s actually an approach that makes a LOT of sense, especially if you were going to filter that spliced feed to begin with. The big advantage I see is that you end up with a considerably larger, but still relevant spliced feed.
I don’t know of a tool that allows you to filter the individual feeds in a remotely hosted OPML before they are spliced, but as I write this, I’m going to send in a feature request to the folks behind mySyndicaat and see what they say about the feasibility of this.
If you still want to pursue filtering before splicing right now, then you’d typically take each of the individual feeds, filter them, and eventually splice them. You’d be losing the dynamic nature of the original OPML file.
To what extent does this answer your question?
Please feel free to ping me @cleverclogs on Twitter if I happen to fail to check this thread, ok?
Comment by Marjolein Hoekstra — October 26, 2007 @ 10:13 am
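The filter-before-splicing approach from the comment above could be sketched as follows. This is a hypothetical illustration, not how mySyndicaat or any other tool mentioned here works: feed items are plain dictionaries rather than parsed RSS, and the function names and sample items are made up.

```python
from datetime import date


def filter_feed(items, keyword):
    """Keep only the items whose title or summary mentions the keyword."""
    needle = keyword.lower()
    return [it for it in items if needle in (it["title"] + " " + it["summary"]).lower()]


def splice(feeds):
    """Merge several feeds into one list, newest item first."""
    merged = [item for feed in feeds for item in feed]
    return sorted(merged, key=lambda it: it["date"], reverse=True)


def filter_then_splice(feeds, keyword):
    """Filter each feed on its own, then splice what is left."""
    return splice(filter_feed(feed, keyword) for feed in feeds)


# Made-up items standing in for two feeds listed in an OPML.
FEED_A = [
    {"title": "OPML tricks", "summary": "Nesting outlines", "date": date(2007, 10, 20)},
    {"title": "Lunch", "summary": "Soup again", "date": date(2007, 10, 21)},
]
FEED_B = [
    {"title": "Grazr hack", "summary": "Reads an OPML file", "date": date(2007, 10, 22)},
]

result = filter_then_splice([FEED_A, FEED_B], "opml")
print([item["title"] for item in result])
```

Because the filtering happens per feed, the spliced result only spends its slots on relevant items, which is the advantage described in the comment.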
Marjolein,
Remember Feedster, where you could enter an OPML URL in one search box, then enter a keyword in the other search box? Isn’t this searching in an OPML URL?
I guess this OPML had to have just feeds as nodes; if it had an OPML include as a node, I don’t think it would search in the included OPML as well.
I recall perhaps FeedRinse lets you filter each feed in an OPML.
Comment by Johnt — October 27, 2007 @ 12:42 am