…Google has joined the real-live web, hooray!
Now from what I can see this is a blog engine only (using RSS to index), it is not an RSS engine for the web…the content is from blogs only…so it is an RSS engine for the blogosphere.
I guess with this premise Google is separating its News engine from its Blog engine, and from its Web engine…although the News and Web engines aren’t indexed via RSS, they take the traditional approach…so Google Blog Search is its first RSS engine.
From the website:
“While Google web search has allowed you to limit results to popular blog file types such as RSS and XML in web search results for some time, and its news search includes some blogs as sources, Google hasn’t had a specialized tool to surface purely blog postings”
There is still a limit to completeness in indexing…it finds sites from pinging services, then crawls the RSS feeds…this content in turn delivers the search results
…but this content is based only on the RSS feed, so if an RSS feed doesn’t offer full-text, the blog isn’t indexed completely.
It’s a toss-up between offering full-text RSS, so that all your content is discovered by search engines and you get traffic that way, and offering RSS summaries, so that people reading your feed are driven to your site.
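To make the full-text versus summary trade-off concrete, here is a minimal sketch of a feed-only indexer. It is not Google's actual pipeline, just an illustration under stated assumptions: the feed, the post text, the example.com link and the `index_feed` helper are all made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: the description holds only an excerpt of the post.
FEED = """<rss version="2.0"><channel>
<title>Example Blog</title>
<item>
  <title>Library tools roundup</title>
  <link>http://example.com/post-1</link>
  <description>A quick look at some new search tools...</description>
</item>
</channel></rss>"""

# The full post as it sits on the (hypothetical) web server.
FULL_POST = "A quick look at some new search tools, including bookmarklets and tag clouds."


def index_feed(feed_xml):
    """Map each lowercase word in item titles/descriptions to the set of item links."""
    index = {}
    for item in ET.fromstring(feed_xml).iter("item"):
        link = item.findtext("link", default="")
        text = item.findtext("title", default="") + " " + item.findtext("description", default="")
        for word in text.lower().split():
            # Strip trailing punctuation so "tools..." is indexed as "tools".
            index.setdefault(word.strip(".,!?"), set()).add(link)
    return index


index = index_feed(FEED)
print(index.get("search"))        # word is in the excerpt, so the post is found
print(index.get("bookmarklets"))  # word is only in the full post, so nothing is found
```

An engine that only reads the feed never sees "bookmarklets", even though the word is right there in the post on the site.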
More on this at SEW blog…mentions the Technorati approach:
“It’s not FULL TEXT blog search. Huh? If you post to a blog, you might not send out the entire text of your post in a feed. We don’t, for instance. Our reason is that we don’t want everyone assuming they can reprint our material. Jason Calacanis of Weblogs has written of similar issues despite copyright warnings in his full-text feed. But Google’s only currently searching what’s in the feed, meaning that it actually may be ignorant of a huge amount of blog content that’s not pushed in a feed. That produces some skewing, as I found with PubSub back in June.
Ideally, I’d like to see Google do what Technorati does and grab the actual full-text of the post, rather than depend just on the feed. For its part, Google says this is something it’s pondering.”
Also, if you do a search like Library, some related blog home pages are shown above the top result…whereas Technorati uses its new Blog Finder search. I wonder if Technorati will implement some suggestions from its Blog Finder search into the general search engine…just like it does with its Tag search…see here.
More from the SEW article:
“…it’s not a true full-text search across all sources… this is because some publishers only syndicate excerpts of content via RSS. Google’s blog search indexes all of the content it finds in feeds, but does not attempt to access and index the full content available on a publisher’s web server.”
Relevance or Date ranking
RSS feed or an Atom feed
Limit by Title
Date (choose this for most recent stuff)
Even though the standard operators work such as:
Check out the new ones:
Wow, you can really home in on a post.
These are on the default sort of relevancy…whereas if I were to subscribe to be notified of recent entries I would sort by date (isn’t this what blogs are about? You’d think date sorting would be the default).
Trying to limit results just to my blog:
inpostauthor:johnt (333 hits)
Not unique enough as there are other authors with the name johnt
blogurl:libraryclips.blogsome.com (194 hits)
Every hit is from my blog, awesome!
allinblogtitle:library clips (194 hits)
Equally awesome! (as long as someone else doesn’t have the same words in their blog)
site:libraryclips.blogsome.com (0 hits)
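If you wanted to script queries like these, a small helper along the following lines could assemble them. This is only a sketch: the operator names come from the searches above, but `blog_query` is a hypothetical helper, the search term "rss" is just an example, and nothing here assumes a particular search URL.

```python
from urllib.parse import quote_plus


def blog_query(terms="", **operators):
    """Join free-text terms with operator:value pairs into one query string."""
    parts = [terms] if terms else []
    parts += [f"{name}:{value}" for name, value in operators.items()]
    return " ".join(parts)


# Operator names are taken from the searches above; "rss" is an example term.
q_author = blog_query(inpostauthor="johnt")
q_blog = blog_query("rss", blogurl="libraryclips.blogsome.com")

print(q_author)            # inpostauthor:johnt
print(q_blog)              # rss blogurl:libraryclips.blogsome.com
print(quote_plus(q_blog))  # the same query, encoded for use in a query string
```

Keeping the operators as keyword arguments makes it easy to combine a restriction like blogurl: with ordinary search words.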
The link search doesn’t work very well compared to other engines…so many of my own URLs are mixed in the results.
Also I’d like to see a citation link under every hit.
Check out the FAQ.
Haven’t consulted my RSS reader for a day, let’s see if there is already a bookmarklet.