Google Blog Search!
Search Engine Watch has the inside story about Google releasing the beta of their new blog search service
…Google has joined the real-live web, hooray!
Now, from what I can see, this is a blog engine only (using RSS to index); it is not an RSS engine for the web…the content is from blogs only…so it is an RSS engine for the blogosphere.
Feedster does this by separating its RSS web engine from its RSS blog engine.
I guess on this premise Google is separating its News engine from its Blog engine, and from its Web engine…although the News and Web engines aren’t indexed via RSS, they take the traditional approach…so Google Blog Search is its first RSS engine.
From the website:
“While Google web search has allowed you to limit results to popular blog file types such as RSS and XML in web search results for some time, and its news search includes some blogs as sources, Google hasn’t had a specialized tool to surface purely blog postings”
There is still a limit on how complete the index can be…it finds sites via pinging services, then crawls their RSS feeds…and that feed content is what delivers the search results
…but this content is based only on the RSS feed; if a feed doesn’t offer full text, the blog isn’t indexed completely.
It’s a toss-up between offering full-text RSS, so that all your content is discovered by search engines and you get traffic that way, and offering RSS summaries, so that people reading your feed click through to your site.
More on this at the SEW blog, which mentions the Technorati approach:
“It’s not FULL TEXT blog search. Huh? If you post to a blog, you might not send out the entire text of your post in a feed. We don’t, for instance. Our reason is that we don’t want everyone assuming they can reprint our material. Jason Calacanis of Weblogs has written of similar issues despite copyright warnings in his full-text feed. But Google’s only currently searching what’s in the feed, meaning that it actually may be ignorant of a huge amount of blog content that’s not pushed in a feed. That produces some skewing, as I found with PubSub back in June.
Ideally, I’d like to see Google do what Technorati does and grab the actual full-text of the post, rather than depend just on the feed. For its part, Google says this is something it’s pondering.”
Also, if you do a search like Library, some related blog home pages are shown above the top result…whereas Technorati uses its new Blog Finder search. I wonder if Technorati would implement some suggestions from its Blog Finder search into the general search engine…just like it does with its Tag search…see here.
More from the SEW article:
“…it’s not a true full-text search across all sources… this is because some publishers only syndicate excerpts of content via RSS. Google’s blog search indexes all of the content it finds in feeds, but does not attempt to access and index the full content available on a publisher’s web server.”
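To see what a feed-only index would actually have to work with, here is a minimal Python sketch that reads an RSS 2.0 feed and reports, per item, the text available in the feed itself. It is only an illustration, not how Google does it: the function name feed_only_view and the sample feed are made up, and it assumes full-text feeds carry the post body in the common content:encoded element.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of <content:encoded>, the element many feeds
# use to carry the full post body (assumption: the feed follows this habit).
CONTENT_ENCODED = "{http://purl.org/rss/1.0/modules/content/}encoded"


def feed_only_view(feed_xml):
    """Return (title, text, has_full_body) for each item in an RSS 2.0 feed.

    'text' is what an indexer that reads only the feed would see:
    the full body if the feed includes one, otherwise just the description.
    """
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        full_body = item.findtext(CONTENT_ENCODED)
        summary = item.findtext("description", default="")
        text = full_body if full_body is not None else summary
        items.append((title, text, full_body is not None))
    return items


# Made-up feed: one post syndicated in full, one as an excerpt only.
SAMPLE_FEED = """<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Example blog</title>
    <item>
      <title>Full-text post</title>
      <description>Short teaser.</description>
      <content:encoded>The whole post body is in the feed.</content:encoded>
    </item>
    <item>
      <title>Excerpt-only post</title>
      <description>Only this excerpt is syndicated.</description>
    </item>
  </channel>
</rss>"""

for title, text, has_full_body in feed_only_view(SAMPLE_FEED):
    print(title, "|", "full text" if has_full_body else "excerpt only", "|", text)
```

For the second item, anything on the blog page beyond the excerpt would be invisible to a search engine that stops at the feed.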
Offers:
Relevance or Date ranking
RSS feed or an Atom feed
Limit by Title
Author
Date (choose this for most recent stuff)
The standard operators work, such as:
link:
site:
intitle:
inurl:
Check out the new ones:
inblogtitle:
inposttitle:
inpostauthor:
blogurl:
Wow, you can really home in on a post.
Example
These use the default sort, by relevance…whereas if I were subscribing to be notified of recent entries I would sort by date (isn’t this what blogs are about? You’d think date sorting would be the default).
Trying to limit results just to my blog:
inpostauthor:johnt (333 hits)
Not unique enough, as there are other authors with the name johnt.
blogurl:libraryclips.blogsome.com (194 hits)
Every hit is from my blog, awesome!
allinblogtitle:library clips (194 hits)
Equally awesome! (as long as someone else doesn’t have the same words in their blog title)
site:libraryclips.blogsome.com (0 hits)
Doesn’t work.
link:libraryclips.blogsome.com
The link search doesn’t work very well compared to other engines; so many of my own URLs are mixed in with the results.
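If you run these kinds of searches often, the operator queries can be put together in a few lines of Python. This is just a sketch: BASE_URL and the sort parameter are hypothetical placeholders (I haven’t checked the real URL format), and only the operator names come from the list above.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, for illustration only.
BASE_URL = "http://blogsearch.example/search"


def blog_query(terms="", **operators):
    """Join operator:value pairs and free-text terms into one query string."""
    parts = [f"{name}:{value}" for name, value in operators.items()]
    if terms:
        parts.append(terms)
    return " ".join(parts)


def search_url(query, sort_by_date=False):
    """Build a search URL for the query; 'sort' is an assumed parameter name."""
    params = {"q": query}
    if sort_by_date:
        params["sort"] = "date"
    return BASE_URL + "?" + urlencode(params)


# The searches tried above, expressed with the helper.
by_author = blog_query(inpostauthor="johnt")
by_blog_url = blog_query(blogurl="libraryclips.blogsome.com")
by_blog_title = blog_query(allinblogtitle="library clips")

print(by_author)
print(by_blog_url)
print(by_blog_title)
print(search_url(by_blog_url, sort_by_date=True))
```

Sorting by date is left as an explicit flag here because relevance is the default sort.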
A refresher on keyword and link searching with RSS engines:
KeyWord Search Comparison Chart
PDF file comparing how blog search engines work
All that’s missing is search by category…maybe that’s next! Why not, since people categorise their blog posts? I’m sure they could take a leaf out of Technorati’s book; IceRocket has joined in.
Also, I’d like to see a citation link under every hit.
Check out the FAQ.
I haven’t consulted my RSS reader for a day; let’s see if there is already a bookmarklet.

RSS Feeds
If the hype around RSS and Blogging wasn’t already insane it seems to be changing up a gear. First off it was the launch of Google’s blog search then the mysterious Flock Browser which seems to have integration for just about every piece of blog softwar…
Trackback by Andy Hedges — September 15, 2005 @ 8:46 pm