Library clips

sharing ideas thoughts and feedback

February 16, 2007

Searchfeedr: searching the limits and OPML search

Filed under: tools, search

A while back I mentioned wanting to search the full text of multiple webpages by having them represented by one URL.
This URL could be an OPML URL. Say this OPML URL contains 10 nodes, and each node is the home page of a blog. All I would have to do is enter this OPML into one search box, and in the other search box enter a search query…I call this meta-searching via an OPML URL.
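To make the idea concrete, here is a minimal sketch in Python. It assumes each outline node carries its page address in an htmlUrl, url or xmlUrl attribute, which is the usual convention but not guaranteed for every OPML file. The names outline_urls, meta_search, SAMPLE_OPML and FAKE_PAGES are made up for illustration, and the fetch function is a stand-in for whatever actually downloads a page.

```python
import xml.etree.ElementTree as ET


def outline_urls(opml_text):
    """Return one URL per outline node that has one, in document order."""
    root = ET.fromstring(opml_text)
    urls = []
    for node in root.iter("outline"):
        # Assumption: prefer the blog home page, then a plain link, then the feed.
        url = node.get("htmlUrl") or node.get("url") or node.get("xmlUrl")
        if url:
            urls.append(url)
    return urls


def meta_search(opml_text, query, fetch):
    """Return the URLs in the OPML whose fetched text contains the query.

    `fetch` is any function that maps a URL to the text of that page.
    The match is a plain case-insensitive substring test, not a real index.
    """
    needle = query.lower()
    return [u for u in outline_urls(opml_text) if needle in fetch(u).lower()]


# Hypothetical two-node OPML and canned page text, so the sketch runs offline.
SAMPLE_OPML = """<opml version="1.1">
  <head><title>Example blogroll</title></head>
  <body>
    <outline text="Blog A" htmlUrl="http://a.example.com/" xmlUrl="http://a.example.com/feed"/>
    <outline text="Blog B" htmlUrl="http://b.example.com/" xmlUrl="http://b.example.com/feed"/>
  </body>
</opml>"""

FAKE_PAGES = {
    "http://a.example.com/": "Notes on OPML search and reading lists",
    "http://b.example.com/": "A post about gardening",
}

if __name__ == "__main__":
    print(meta_search(SAMPLE_OPML, "opml", FAKE_PAGES.get))
```

In a real tool the fetch function would download each page (or hand the list of URLs to a search engine), but the shape stays the same: one OPML in, a list of pages to search out.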

In my Search rollers post I mention some tinkerings from Tony Hirst that are very close to what I’m after.

1. Search the full-text of links on a given page (HTML URL), see here.

2. Search full-text of del.icio.us links via one of your del.icio.us tag URLs, see here.

3. Search the full-text of links listed on a given page (feed URL), see here.

4. Search the full-text of webpages that link to another webpage, see here…also a related search.
(Technorati shows you all the posts that link to your blog, but you can’t then search within this set of posts)

Now these are all rolled into one search interface, see Searchfeedr…also check out the tools.

Here are some examples:

Search pages linked to from a particular page
(ie. search the full-text of the links on a webpage)

Search over sites that link into a particular page
(ie. search the full-text of sites that are talking about (linking to) this site)

Search over links pulled in from a Technorati Tag via RSS
(search within the feed content of each site in the search results of a Technorati tag…this is handy as Technorati doesn’t let you search within a tag)

Search over a del.icio.us tag
(ie. search the full-text in the domain of the last 15 bookmarks in one of my del.icio.us tags, called “OPML”)
I can do this to search full-text in just the pages (not domains), see here.

Not sure if it does this, but Tony once helped me to find one of my blog posts that linked to someone else’s blog post. I knew the blog post I linked to, but I couldn’t find my post that was linking to it.
A search failed because, as sometimes happens when you link to a blog post, the label for the hyperlink was rather generic, like “see here” or “more here”. This is very unhelpful later on.

Tony suggested using Alta Vista, see here:
link:http://www.techcrunch.com/2006/01/10/searchfox-to-shut-down url:libraryclips.blogsome.com
But this still didn’t work too well because it didn’t return the post/s; it returned a month URL from my blog, so I’d have to scan each post, hovering over all the links…at least it gets you in the ball park.

Can Searchfeedr do this?
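The "hovering over all the links" step is the part a script could do. This is only a sketch of that step, not how AltaVista or Searchfeedr work: it assumes you already have the HTML of each post in hand, and the names LinkCollector, posts_linking_to and EXAMPLE_POSTS are invented for the example. It ignores the link label completely and matches on the link target, so a generic “see here” is no obstacle.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def posts_linking_to(posts, target_url):
    """Return the post URLs whose HTML contains a link to target_url.

    `posts` maps a post URL to that post's HTML. Trailing slashes are
    ignored when comparing, since the same page is often linked both ways.
    """
    target = target_url.rstrip("/")
    matches = []
    for post_url, html in posts.items():
        collector = LinkCollector()
        collector.feed(html)
        if any(href.rstrip("/") == target for href in collector.hrefs):
            matches.append(post_url)
    return matches


# Hypothetical posts with unhelpful link labels.
EXAMPLE_POSTS = {
    "http://blog.example.com/post-1": (
        '<p>Searchfox is closing, <a href="http://www.techcrunch.com/'
        '2006/01/10/searchfox-to-shut-down/">see here</a>.</p>'
    ),
    "http://blog.example.com/post-2": (
        '<p>Unrelated, <a href="http://example.org/">more here</a>.</p>'
    ),
}

if __name__ == "__main__":
    target = "http://www.techcrunch.com/2006/01/10/searchfox-to-shut-down"
    print(posts_linking_to(EXAMPLE_POSTS, target))
```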

Tony looked at my original question in this post of searching via an OPML URL, but it seems a lot of work.

I’ve mentioned this before, but in the old days (last year), Feedster would let you search across an OPML URL…it would search in the feed and not the HTML, but at least it was something.

For now

My idea is that whenever you come across an OPML, or have your own OPML, you can quickly whack it in a search engine and search it. For now, here are the manual alternatives.

1. Create a new account at an RSS Reader eg. Bloglines, and search this OPML, if you make it public others can search this at the advanced search page…this is a long process if you just wanted to search this OPML once in your life, same with the solutions below.

2. Similar to point 1, register a Public RSS Reader, import the OPML, and done, now you can search across this OPML, eg. Blogdigger Groups, MySyndicaat, Feed Collectors, Technorati Favourites, etc…

3. Save the OPML URL you came across, to your PC, then upload it to Google CSE…this is creating your own search engine, in order to search all the sites in this OPML. From time to time you would have to reload this OPML to make sure it is the latest version.

4. CleverClogs Grazr mashup allows you to search in an OPML.

As you can see these methods require way too much effort if your intention is a quick once only search.

Another thing to consider is just say this OPML has other OPML’s within it (these are called OPML includes).
eg. Imagine if the Yahoo! Directory was wrapped in OPML, and each topic was its own OPML, so you would have lots of OPML’s within the one mother OPML.
The idea is to search the mother OPML, or just choose one of the topic OPML’s to search, or maybe select a few topic OPML’s to search across.

You can already search in the Yahoo! Directory, but just say you come across any old OPML, and because this OPML has lots of OPML includes it is considered an OPML Directory (lots of the nodes in this OPML outline are other OPML’s). I want to be able to enter any of these OPML nodes (includes) or the entire OPML into a search engine and search it.

At the moment anyone can make their own Yahoo! Directory using OPML outlines, each topic being an OPML include, but now we need to be able to plug this in to search across it (even limited to a topic).
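Here is a rough sketch of how a search tool might flatten such a directory into a list of sites to search. It assumes includes are marked with type="include" and a url attribute pointing at another OPML file, which is one common convention but not the only one in the wild. The function collect_site_urls, the load callback and the FAKE_OPML directory are all hypothetical.

```python
import xml.etree.ElementTree as ET


def collect_site_urls(opml_url, load, seen=None):
    """Return every htmlUrl reachable from opml_url, following includes.

    `load` maps an OPML URL to its XML text. `seen` guards against two
    OPML files including each other and looping forever.
    """
    seen = set() if seen is None else seen
    if opml_url in seen:
        return []
    seen.add(opml_url)

    urls = []
    root = ET.fromstring(load(opml_url))
    for node in root.iter("outline"):
        if node.get("type") == "include" and node.get("url"):
            # An include node: descend into the topic OPML it points at.
            urls.extend(collect_site_urls(node.get("url"), load, seen))
        elif node.get("htmlUrl"):
            urls.append(node.get("htmlUrl"))
    return urls


# A made-up "mother" OPML with two topic OPML's, kept in memory for the demo.
FAKE_OPML = {
    "http://dir.example.com/mother.opml": (
        '<opml version="2.0"><body>'
        '<outline text="Search" type="include" url="http://dir.example.com/search.opml"/>'
        '<outline text="Libraries" type="include" url="http://dir.example.com/libraries.opml"/>'
        '</body></opml>'
    ),
    "http://dir.example.com/search.opml": (
        '<opml version="2.0"><body>'
        '<outline text="Search blog" htmlUrl="http://search.example.com/"/>'
        '</body></opml>'
    ),
    "http://dir.example.com/libraries.opml": (
        '<opml version="2.0"><body>'
        '<outline text="Library blog" htmlUrl="http://library.example.com/"/>'
        '</body></opml>'
    ),
}

if __name__ == "__main__":
    load = FAKE_OPML.__getitem__
    # Search scope: the whole directory, or just one topic.
    print(collect_site_urls("http://dir.example.com/mother.opml", load))
    print(collect_site_urls("http://dir.example.com/search.opml", load))
```

Starting from the mother OPML gives the whole directory as the search scope, while starting from a single topic OPML limits the scope to that topic; selecting a few topics is just joining a few of these lists.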

Related:
OPML on the fly
Search in an OPML
Search your blog
My OPML wishlist
