Library clips

sharing ideas thoughts and feedback

June 24, 2005

Subscribe to a blog via the RSS engines??

Filed under: General, blogs, rss, search

Check this out, you can read your favourite blog via Findory

…here’s my blog…uniquely, it churns out an RSS feed for related articles.

Not only that it also shows related blogs and related articles…good stuff!

…I’ve put this on the sidebar under Abstract Index.

For the budding librarians I also noticed LISFeeds presents abstracts of your latest posts.

Another cool feature is Findory Neighbours

…here are my neighbours, I’ve added this to the sidebar under Statistics.

Then it got me thinking, how do the other RSS engines display your posts?

…I consulted the RSS engines that I know of that do site or url syntax searches.

I found that Blogdigger can act as an abstract index of your blog via the author syntax…see author:(johnt)…and also generates an RSS feed.
(67 current hits)

Since Blogdigger also indexes del.icio.us (it is an RSS engine after all) my results are a bit varied as I also use the author name, johnt, for my del.icio.us account.

I don’t think subscribing to this feed would be accurate, as it is not unique enough, as many authors could be called johnt.

Blogdigger only indexes del.icio.us at the tag level, not at the user level…well that’s what it does for me, but then I’ve seen results at the user level for other people, check out this random example.

Now you get virtually the same results by using the site syntax, see mine at site:libraryclips.blogsome.com.
(75 current hits)

That’s a bit odd, if anything the author search should be higher as I use the johnt author name for 2 different services.

If you see one of the results on the previous search query, you can click on the focus link which gives you a search for the blogID, as it turns out this has an equal number of current hits, at 75…here it is, blogID:150597.

I also have another del.icio.us account that is a mirror for my blog (del.icio.us/libraryclips)
…so when you do the author search, author:(libraryclips) in Blogdigger, you get a title index of my blog, with an RSS feed to boot (but you’re better off subscribing to the native feed from the del.icio.us account).
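
To keep the Blogdigger syntaxes above straight, here is a minimal Python sketch that only builds the query strings used in this post. The function names are hypothetical, and nothing here calls Blogdigger itself; the resulting strings are what you would type into the search box.

```python
# Hypothetical helpers: they only format the query strings quoted in this post.
def author_query(name: str) -> str:
    """Author syntax, e.g. author:(johnt)."""
    return f"author:({name})"


def site_query(host: str) -> str:
    """Site syntax, e.g. site:libraryclips.blogsome.com."""
    return f"site:{host}"


def blog_id_query(blog_id: int) -> str:
    """blogID syntax, reached via the focus link on a result."""
    return f"blogID:{blog_id}"


queries = [
    author_query("johnt"),
    author_query("libraryclips"),
    site_query("libraryclips.blogsome.com"),
    blog_id_query(150597),
]

for query in queries:
    print(query)
```

The author queries are the least reliable of the four, for the reason given earlier: an author name is not unique, while a site or blogID is.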

I thought I’d try Feedster as their fielded searching is very comprehensive, and RSS feeds are returned for every search.

As usual there is the author search, but as mentioned before I don’t think this is unique enough.

The url syntax also included hits from my comments permalinks and from del.icio.us, see url:libraryclips.blogsome.com
(184 current hits)

This is the best so far considering this post I’m writing now is my 176th post.

NOTE: the total number of hits was “1,118 results / Page 1 of 75”, but when I got to the 13th page the number changed to “184 results / Page 13 of 13”…what’s going on, these numbers are alive.

Also, when I left a space in between the url syntax and the search term I got a different result, “1,087 results / Page 1 of 73”, and when I got to the 12th page it said “167 results / Page 12 of 12”.

It seems that when you click next page of results, then next page, and so on, the total number of results changes for some reason.

The site syntax (with and without the space) returned “167 results / Page 12 of 12” (same as the url syntax with the space).
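
A quick arithmetic check on the Feedster numbers quoted above: all four result/page pairs are consistent with 15 hits per page. That page size is an inference from the quoted counts, not something Feedster states, so treat this as a sketch under that assumption. If it holds, the page counts always agree with the totals, and it is only the total itself that shrinks as you page through.

```python
import math

# Assumption: 15 results per page, inferred from the counts quoted in this post.
RESULTS_PER_PAGE = 15


def page_count(total_results: int, per_page: int = RESULTS_PER_PAGE) -> int:
    """Number of pages needed to list total_results hits."""
    return math.ceil(total_results / per_page)


# (reported results, reported pages) pairs quoted in this post
reported = [(1118, 75), (184, 13), (1087, 73), (167, 12)]

for results, pages in reported:
    print(results, "results ->", page_count(results), "pages (reported:", pages, ")")
```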

Too bad about this one:

“feed_id
If you know Feedster’s ID of a feed, you can restrict to that feed browser feed_id:47; we are working on a method of exposing feed IDs”

Lastly is PubSub…but I’m too tired now

…their Syntax Help page includes searching by “source:” and by “uri:”

Wow, the number of RSS feeds to track this blog is crazy (stupid idea if I wanted to track statistics of my readership, especially for business purposes, but that’s not the case here, and if it was, it’s way too late).

Hang on…I forgot about Google…you can subscribe to the RSS of latest posts according to these search queries via Google Alert.

site:libraryclips.blogsome.com
(403 current hits)

Seems to return results of the date archive permalinks, so that’s why this number is boosted, but the results are pretty clean as they all seem to come from the blog site.

inurl:libraryclips.blogsome.com
(432 current hits)

This was expected as other services such as bookmarking managers build on top of the original url.

Keep in mind that Google doesn’t index as fast as the popular RSS engines; when I last checked it was 4 days behind for my blog.

I won’t bother mentioning the other traditional search engines that do site and inurl searches.

These are the ways you can subscribe to recent posts:

  • My native feed
  • Feedburner feed
  • del.icio.us (mirror account) feed
  • Findory (source) feed…this is a feed for related articles to your blog
  • Blogdigger (author) feed
  • Blogdigger (other author) feed
  • Blogdigger (site) feed
  • Blogdigger (blogID) feed
  • Feedster (author) feed
  • Feedster (url) feed
  • Feedster (site) feed
  • Feedster (feed_id) feed [coming soon]
  • PubSub (source) feed
  • PubSub (uri) feed
  • Google (site) feed
  • Google (inurl) feed
  • BlogPulse (url) feed

Oops, I forgot these:

  • Talkr feed [text to audio podcast feed]
  • Bot-a-blog feed [rss to email feed]
  • Winksite feed [mobile feed]
  • Messenger Alert Me [IM feed]

Then there are the category feeds:

And lastly the search feeds and link feeds, although these don’t apply to viewing recent posts.

If I’ve missed anyone or anything it will hopefully be here.

I don’t know how I got here, but in the end there is nothing as good as the native feed.

…I must conclude that the winner, both for a hit count closest to my actual number of posts and for quality of results, was the Feedster url syntax, closely followed by the site syntax.

…and that’s that!

Actually, there’s more…

[ADDED 25/07/05: BlogPulse RSS Summaries]

June 23, 2005

RSS: full-text or summaries!

Filed under: blogs, rss, readers, search

Reading this article from SEW: How Search Engines Index RSS & Why It Doesn’t Necessarily Matter has got me thinking about RSS feed discovery.

RSS feed engines seem to only index content in the feed and not from the blog/website itself.
So if the feed is only showing summaries instead of full-text, then the RSS engine is only indexing virtually the first paragraph of the blog post.

In turn this also affects your link stats, as you may have links to other sites in the later part of your post which aren’t being read by the RSS engine…vice versa, people could be linking to you in the later part of their posts, but since the RSS engine doesn’t pick this up, you won’t know about it (unless they trackback you).
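
As a rough illustration of the link-stats point, here is a minimal Python sketch. It assumes a summary feed carries only the first paragraph of a post; the post content and URLs are made up for the example. It compares the links an engine would see in the summary with the links in the full text.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag it is fed."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical post: one link in the first paragraph, one further down.
full_text = (
    '<p>First paragraph mentions <a href="http://example.com/a">site A</a>.</p>'
    '<p>A later paragraph links to <a href="http://example.org/b">site B</a>.</p>'
)

# Assumption: the summary version of the feed is just the first paragraph.
summary = full_text.split("</p>")[0] + "</p>"

links_full = extract_links(full_text)
links_summary = extract_links(summary)
missed = [link for link in links_full if link not in links_summary]

print("seen in summary feed:", links_summary)
print("missed by the engine:", missed)
```

An engine indexing only the summary never sees the link to site B, so site B’s owner gets no linkback from it.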

So, observing this, full-text versions of your feed seem important from the:

  • indexing point of view

    so when you search an RSS engine you are searching every word in the blog/website not just a portion of it – otherwise this doesn’t make for good discovery or findability.
    ….you want to think you are searching all that is available to search, that’s the convenience of a search engine, people aren’t going to assume that with some sites you are only getting results at the abstract/summaries level, they assume you are searching every word on every webpage/blog post

  • statistics point of view

    as discussed above, your link stats are not a complete picture unless the full text of feeds is exposed to RSS engines

  • convenience point of view

    I know I find it more convenient reading the whole blog post or feed entry in my reader, without having to click to the native site to read the complete entry
    …but we have to understand that people make a living out of the web and blogs, so click-throughs are of monetary importance

To my understanding, if you don’t like reading full-text feeds in your reader and you use Bloglines, you can set it (per feed) to show only the title or a summary.

But if the feed is sent as a summary, and you choose to read it as full-text, it won’t work as the feed owner has the power.

Now to my understanding when you write a blog post you can choose to write the post in the body section ignoring the excerpt section. All this means is that the version of the blog post on the home page is the complete version, as it is on the blog entry (permalink).

You can also choose to write a little excerpt.
In this case your blog home page, will have a read more link on each blog post which takes you to the complete version of the blog entry (permalink).

This is handy for some people, as you can scroll through the home page, or a category page kind of like scrolling through a title index, well really it’s an abstract index.

But some people may prefer to read the blog posts in their entirety from the home page, instead of having to click to read the complete entries.

…you can’t please everyone!

Anyway, regardless of whether you use an excerpt or not, this only affects the presentation of the content on your blog (at the actual blog site), it doesn’t affect the content of the RSS version of your blog.

I.e. if you publish full-text RSS feeds, but use an excerpt in your posts, you will still get the full-text version in your RSS reader.

I’m glad I worked that one out…I hope!

I must say, that I don’t use excerpts in my posts, which makes it slower for me to browse my posts, as I have to scroll through these long posts.

Although sometimes what I’m looking for is in the body of a post and I won’t know it’s there by reading the title or the excerpt, so reading the whole post from the home page saves me from clicking on every post to view its body.

…by the way my search isn’t very effective

…maybe I should put in a search box from an RSS engine
(I publish full-text RSS feeds, so I know when I search my blog I’ll be searching the full-text)

Anyway, since I prefer not to use excerpts as mentioned above, I still would like the best of both worlds if I could.

That’s what this blog has done (well sort of done)
…Dave uses excerpts in his posts, so it’s quicker than scrolling through full-text posts, but there is an even quicker option, that is, to browse via a title index (next to each category and next to each month there is a little grid box icon – this is to view just the title list in a particular category or month)

I’m yet to find a WordPress plugin that can generate a title index (at the date level and/or category level)

The way I get around this at the moment is indexing every post in del.icio.us/libraryclips (this way I have a title index)
…I could make this an excerpt index, but I leave the extended field blank when I post in del.icio.us.
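
For what a title index at the date and category level might look like, here is a minimal Python sketch. It is not a WordPress plugin and uses no WordPress API; the function names are hypothetical, and the sample data is just three posts from this page with their "Filed under" categories.

```python
from collections import defaultdict

# Sample data taken from the posts on this page.
posts = [
    {
        "date": "2005-06-24",
        "categories": ["General", "blogs", "rss", "search"],
        "title": "Subscribe to a blog via the RSS engines??",
    },
    {
        "date": "2005-06-23",
        "categories": ["blogs", "rss", "readers", "search"],
        "title": "RSS: full-text or summaries!",
    },
    {
        "date": "2005-06-22",
        "categories": ["General", "readers"],
        "title": "SharpReader notifier",
    },
]


def index_by_month(posts):
    """Title index at the date level: 'YYYY-MM' -> list of titles."""
    index = defaultdict(list)
    for post in posts:
        index[post["date"][:7]].append(post["title"])
    return dict(index)


def index_by_category(posts):
    """Title index at the category level: category -> list of titles."""
    index = defaultdict(list)
    for post in posts:
        for category in post["categories"]:
            index[category].append(post["title"])
    return dict(index)


for category, titles in sorted(index_by_category(posts).items()):
    print(category)
    for title in titles:
        print("  " + title)
```

Adding an excerpt field to each post and printing it under the title would turn the same structure into an abstract index.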

Coming back to the post from SEW blog, I have to finish with this quote that I found quite interesting:

“…what happens when it’s not just all the “cool kids” doing feeds but everyone doing feeds? What does feed search mean then? It means relatively nothing. It means, umm, searching the web! So banging on about search engines not indexing feeds sort of misses the point. As feeds encompass everything, the major search engines are already there.
Meanwhile, what happens when everyone is running a blog? Will blog search suddenly be so unique? Or will it be more the case that people will want “news blogs” in a news blog search, while “shopping blogs” might be in a shopping blog search and so on. Or even more likely, as search continues to go vertical, blogs of a vertical nature will be integrated within those types of results.”

June 22, 2005

SharpReader notifier

Filed under: General, readers

Someone needs to hack the Bloglines notifier so you can see the title of the new post, when the notifier pops up.

At the moment the pop up just notifies you of a new entry, it doesn’t tell you what that entry is about

…so how do I know whether it’s worth my while opening Bloglines to read the new post, when I could just be told in the pop-up?

The reason I’ve noticed this is I downloaded Sharpreader to try out a desktop based RSS reader (very cool reader, simple yet effective, I like it…good for beginners, but also good to keep using).

…anyway the SharpReader notifier does exactly this in the pop-up…it shows the title of a new post, and the blog it’s from, and you can even click on the title to link straight to the post in SharpReader (you can even set the time length of the notifier before it disappears).

This way you can decide whether you want to one-click launch to read the article then and there, or if you want to wait a while and read it later when you open up SharpReader.

[ADDED: 23/06/05 Another option would be to be able to have the notifier work at the feed level or folder level…so if you subscribe to 100 feeds, you only want to be notified of your 10 favourite feeds…this way the notifier only pops up for the feeds you have selected.]
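
The feed-level notifier idea can be sketched in a few lines of Python. This is not how SharpReader or the Bloglines notifier work internally; the `NewEntry` type, the `popups` function and the sample entries are all hypothetical. It only shows the filtering: a pop-up with blog name and post title, limited to the feeds you have selected.

```python
from dataclasses import dataclass


@dataclass
class NewEntry:
    feed: str   # name of the blog the entry came from
    title: str  # title of the new post


def popups(entries, favourite_feeds):
    """Return one pop-up line per new entry, only for the selected feeds."""
    return [
        f"{entry.feed}: {entry.title}"
        for entry in entries
        if entry.feed in favourite_feeds
    ]


# Hypothetical new entries arriving from two subscribed feeds.
entries = [
    NewEntry("Library clips", "SharpReader notifier"),
    NewEntry("Some other blog", "A post that can wait"),
]
favourites = {"Library clips"}

for line in popups(entries, favourites):
    print(line)
```

The same filter would work at the folder level by storing a folder name on each entry and checking that instead of the feed name.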

Social bookmarks vs. free text search

A lot of people are starting to use social bookmarking tools as a means for SEO (increasing your web traffic)…tagging your own blog posts gets you double the exposure…

Particletree points this out with an experiment using del.icio.us to bookmark webpages instead of waiting for his site to be crawled by Google …read the post for the results.

Here is more from this post:

“I think the reason del.icio.us is so successful at bringing the appropriate audience to good material is because they track the changing web by using people to calculate what is essentially page rank. They get access to decent fuzzy logic for a fraction of the cost and the democracy of the system allows anyone to get their idea of what deserves face-time into the system almost immediately.”

The differences for SEO:

  • Crawling vs. pinging

    Search engines like Google take longer to index content
    Google Sitemaps is their solution to overcome this issue.

    Pinging services enable the World Live Web or the ChangingWeb; you can keep track of new additions to the web

  • Social bookmarking and blog categories vs. PageRank

    The web community chooses the importance of a page, not an algorithm

    …and once a page has been tagged, it is visible and shared around, and can be re-tagged by many people, applying many tags (according to their way of seeing the world, so to speak - hopefully applying a popular and/or accurate tag), increasing its exposure, and maybe making it visible on the front page (most popular tags).

    Users track other users’ tag accounts (called an inbox in del.icio.us) on a daily basis because they like the stuff they bookmark, or people track a general tag itself (also at the user level) to keep up with the latest according to content bookmarked with that tag.

    Now if you bookmark a page for SEO reasons, someone tracking the tag you used will come across your page, and then they might tag your page with another tag, and someone tracking that tag will see it, and at the same time people tracking someone’s whole account or a tag within an account will see it as a new inclusion.

    So people will come across your page without even looking for it (serendipity)…as folksonomies are largely about discovery (see the end of this post).

    In contrast to Google, where you will only come across the page if you are trying to find something specific.
    Although people do set Google Alerts, or do repetitive searches daily with their favourite search terms…so they may come across your page this way, but again, will your page be ranked high enough to be seen before people stop clicking through results?

    Two engines that are using tagging instead of page rank are Technorati Tags (covers the blogosphere) and Gataga (covers the folkosphere – or whatever it’s called?… I guess the tagosphere would be both of these combined).

    NOTE: see my post on Gataga searching in many fields and searching in just the tag field…according to my trials and observations.

    These tools cover a portion of the web that is savvy with current awareness, so it’s a good place to be a part of if you want traffic to your site.

Tags vs. free-text search

What subject tags try to achieve is to filter through the noise in search results; this noise is often called the “gray web”.

Even though the authors of websites tried this in the earlier days, it failed because of spam issues, and now it’s different as users are tagging pages (multiple heads are better than one when defining the aboutness of something)…see more here.

In Google you see results according to your search terms coupled with the algorithm (that decides on the ranking of results).
So there may be more relevant results than you think, but you can’t see them as they may be hit 2,620 or hit 100,265…you’re just not going to scroll through that many results (so this is a part of the open web that is not invisible, it’s just hidden)…see more here.
So your only solution to bring all the relevant hits to the top is to improve your search terms, make them more precise, use boolean, etc…

In social bookmark tools pages are found according to a tag or subject term, as opposed to a free keyword search, or both eg. Zniff (even though this is just for one bookmark tool)

…soon these social bookmark tools will have lots and lots of results per tag (will the same problem emerge even at the subject level, let alone the free-text level?).
At the moment this is alleviated by searching for more specific tags or combining tags.
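
Combining tags is just a set intersection, which the following minimal Python sketch shows. The tag index and URLs are made up, and `pages_tagged` is a hypothetical name, not a function of any bookmarking service.

```python
# Hypothetical tag index: tag -> set of bookmarked URLs.
tag_index = {
    "rss": {
        "http://example.com/feeds",
        "http://example.com/readers",
        "http://example.com/search",
    },
    "search": {
        "http://example.com/search",
        "http://example.com/seo",
    },
    "blogs": {
        "http://example.com/readers",
        "http://example.com/search",
    },
}


def pages_tagged(index, *tags):
    """Pages carrying ALL of the given tags (empty set if no tags given)."""
    url_sets = [index.get(tag, set()) for tag in tags]
    if not url_sets:
        return set()
    return set.intersection(*url_sets)


print(len(pages_tagged(tag_index, "rss")))          # one tag: broad
print(pages_tagged(tag_index, "rss", "search"))     # two tags: narrower
```

Each extra tag can only shrink the result set, which is why combining tags works as a precision tool. The flip side is the one raised in this post: a page filed under a different tag drops out of the results entirely.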

Also social bookmark tools have a popular tags home page which increases visibility

So, the more your webpage/post is seen, the more chance it has of being blogged about, increasing your traffic again.

Social bookmark tools don’t index all the web, only the pages people choose to tag (it’s a selective version of the web)

But this isn’t a problem for SEO, as you can just bookmark your own page, and away it goes…

Of course there is way more to SEO, but this is just about one aspect of SEO in relation to traditional search compared to social bookmarks.

This post digresses to ideas of the semantic web, where blogs, webpages, index their markup codes with more fields so to speak, eg. date, author, subject, review, job listing…this way anyone can aggregate the content and share it…structured blogging is a foray into semantic blogging.

In the end, is Google (PageRank) suitable for the layperson’s searching needs, or will a subject fielded search be more appropriate?

I guess the argument is in the accuracy in indexing the aboutness of a page…if the page is not correctly indexed according to the searcher, then they may think it doesn’t exist, that’s why free-text search is the safe option.

So the problem is; we are getting too many results

…we need more context for better precision

…but who defines the context is the question?
(tag/subject name and bookmarking the right items within this tag)

…I guess this is now a combination of the user (social bookmarks) and the author (blogs)

…another question is of controlled vocabularies

(that’s out the window in a non-domain specific, multi-discipline environment, with millions of contributors - the content is already uncontrollable, let alone trying to control labelling it all…labelling it brings context, which is what we want, but we can’t control the labels…well we have no choice, and who says controlled labels will help people search in context better than a user-defined/free-tagging system).

…as mentioned the user may have more of a chance finding something with free-text, rather than jumping from tag to tag trying to locate something (although when you find the right tag, you will hopefully find lots of items of quality, compared to free-text where you are competing with a lot of noise).

But then again, maybe free-text is for finding and tags are for discovery…maybe they shouldn’t be compared as one or the other, as they are slightly different tools.

Do we need to educate users on search terms and syntax techniques?

…or should we define webpages by user-defined tags?
(well this is happening anyway)

But then if someone searches a tag/s and doesn’t find what they are looking for, they may quit, whereas the item they were looking for was located in another tag.

The search experience needs to be intuitive.

I think our current answer is to use a bit of both.

Either the tags become part of the page rank (Zniff), or the results page shows the tags applied to each hit (see comment on this post).

Meta: URL bookmark links?

Filed under: tags, tools

What I would like to see is Gataga linkbacks for all services (with an RSS) just like the del.icio.us linkback bookmarklet, which is a pop up of the del.icio.us URL page for a bookmark.

Durl gets you an RSS of this, but only for del.icio.us, and Durl doesn’t have a pop-up bookmarklet, otherwise this tool comes pretty damn close, as it lists all the other services (linkbacks), you just have to click to them to find out.

…more

Can Gataga or Zniff do URL searches like link:yourpage.com?

So you can see how many times, by whom, and with which tags a given webpage has been bookmarked, not only on del.icio.us but across the multiple bookmark services Gataga covers.

And would this work at the home page level and still return results for all webpages within your website, as Spurl does (see here)?
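
To make the wish concrete, here is a minimal Python sketch of what such a URL search could return. Everything in it is an assumption: the bookmark records, user names and post paths are made up, and `linkbacks` is a hypothetical function, not something Gataga, Zniff or Spurl offer. It answers "how many times, by whom, and with which tags", either for one exact page or for a whole site.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical bookmark records gathered from several services.
bookmarks = [
    {"service": "del.icio.us", "user": "alice",
     "url": "http://libraryclips.blogsome.com/post-a/", "tags": ["rss", "readers"]},
    {"service": "Spurl", "user": "bob",
     "url": "http://libraryclips.blogsome.com/post-a/", "tags": ["rss"]},
    {"service": "del.icio.us", "user": "carol",
     "url": "http://libraryclips.blogsome.com/post-b/", "tags": ["search"]},
    {"service": "del.icio.us", "user": "dave",
     "url": "http://example.com/", "tags": ["misc"]},
]


def linkbacks(bookmarks, target, whole_site=False):
    """Summarise bookmarks of one page, or of every page on the same host."""
    if whole_site:
        host = urlparse(target).netloc
        hits = [b for b in bookmarks if urlparse(b["url"]).netloc == host]
    else:
        hits = [b for b in bookmarks if b["url"] == target]
    return {
        "count": len(hits),
        "by": sorted((b["service"], b["user"]) for b in hits),
        "tags": Counter(tag for b in hits for tag in b["tags"]),
    }


page = linkbacks(bookmarks, "http://libraryclips.blogsome.com/post-a/")
site = linkbacks(bookmarks, "http://libraryclips.blogsome.com/", whole_site=True)
print(page["count"], page["by"], dict(page["tags"]))
print(site["count"], dict(site["tags"]))
```

Matching on the host name is one simple way to get the home-page-level behaviour; a real service might match on a URL prefix instead.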
