Can RSS wrapped in OPML be the answer for blog archiving, sharing blog data, and displaying a directory-type index for your blog?
The idea of archiving blog content (blog posts) has been emerging again, and I’m hoping OPML can be the container.
The question I ask is whether the OPML of a blog will have blog post HTML permalinks as items, or use RSS items instead
…the problem, as Paolo points out, is:
“What I think is missing in today’s weblogging infrastructure is a way to point to the xml version of an item, basically having a URI for an RSS item similar to what a permalink is to html version of every post.”
RSS post permalinks
Currently feeds don’t store every blog post; there is a limit on the number of items…although feedcatch is meant to be an RSS archive.
Even if we have an RSS archive, each blog post needs an RSS permalink…this way, if you collated all these blogs (RSS archives), you could search the aggregated RSS of these blogs and each result would have its own permalink.
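Here is a rough sketch of that collated search in Python. It assumes each archive is plain RSS 2.0 and uses each item's guid (or link) as the permalink. That is usually the HTML permalink, since the per-item RSS URL Paolo asks for doesn't exist yet. The feeds, URLs and the function name search_archives are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Two tiny, invented RSS archives standing in for real blogs.
FEED_A = """<rss version="2.0"><channel><title>Blog A</title>
<item><title>OPML as a container</title>
<link>http://a.example/2006/opml-container</link>
<guid isPermaLink="true">http://a.example/2006/opml-container</guid>
<description>Wrapping posts in an outline for archiving.</description></item>
<item><title>Holiday photos</title>
<link>http://a.example/2006/holiday</link>
<description>Nothing technical here.</description></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Blog B</title>
<item><title>Searching OPML directories</title>
<link>http://b.example/opml-search</link>
<description>Notes on outline search engines.</description></item>
</channel></rss>"""


def search_archives(feeds, term):
    """Return (blog title, post title, permalink) for every matching item."""
    term = term.lower()
    results = []
    for xml_text in feeds:
        channel = ET.fromstring(xml_text).find("channel")
        blog = channel.findtext("title", default="")
        for item in channel.findall("item"):
            title = item.findtext("title", default="")
            body = item.findtext("description", default="")
            if term in (title + " " + body).lower():
                # Assumption: guid (when present) or link is the permalink.
                permalink = item.findtext("guid") or item.findtext("link")
                results.append((blog, title, permalink))
    return results


for hit in search_archives([FEED_A, FEED_B], "opml"):
    print(hit)
```

Every hit carries its own link, which is the whole point: without a permalink per item, the results page would have nothing to point at.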
If each blog had RSS permalinks at the blog post level, the RSS version of a blog could be a raw blog (post content only) without HTML, especially if it looked as pleasant as the feedburner feed version of your blog.
This would make standalone feeds such as pu.blish very effective, as they don’t have an HTML page; all they are is a feed URL and an admin area to post.
If this feed URL looks pleasant enough and you can point to permalinks, then this is the new simple blog (content only, just a stream of unorganised posts).
I guess an outliner could be a simple blog, but then you need a URL for each item…you can get one if you use the OPML Editor, though that URL is for an HTML version of your outline.
Post permalinks - HTML or RSS
In this post, I ponder the difference between an OPML with feed items or HTML items for archiving purposes…this was also previously examined in an earlier post (See topic Hmmm).
In the comments of another post I query how a third party OPML, like Utils, for del.icio.us (where the OPML contains HTML links) is kept fresh.
When I generate an OPML for my del.icio.us account via Utils (soon at the tag level), each item or bookmark is an HTML link…at the moment the tags don’t serve as a folder hierarchy, but they could, or each bookmark could have a sub-item listing the tag names and even linking to them…this OPML URL will contain an archive of every bookmark in my del.icio.us account; it’s an outline version.
When I add a new item to my del.icio.us account this will (hopefully) magically be included in the third party OPML URL, provided Utils checks my del.icio.us account every hour and adds any changes.
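I don't know how Utils actually does this, but the "check and add any changes" step could look something like the sketch below. It assumes the OPML stores each bookmark as an outline with type="link" and a url attribute (as OPML 2.0 describes link nodes), and that new bookmarks arrive as simple (title, URL) pairs. The function name merge_bookmarks and the example URLs are invented.

```python
import xml.etree.ElementTree as ET

# An invented OPML archive that already holds one bookmark.
EXISTING = (
    '<opml version="2.0"><head><title>My bookmarks</title></head><body>'
    '<outline text="Known bookmark" type="link" url="http://bookmarks.example/known"/>'
    '</body></opml>'
)


def merge_bookmarks(opml_text, bookmarks):
    """Append bookmarks whose URL is not yet in the outline.

    Returns (updated OPML text, number of bookmarks added).
    """
    root = ET.fromstring(opml_text)
    body = root.find("body")
    # Every URL already archived, at any depth of the outline.
    known = {node.get("url") for node in body.iter("outline")}
    added = 0
    for title, url in bookmarks:
        if url not in known:
            ET.SubElement(body, "outline", {"text": title, "type": "link", "url": url})
            known.add(url)
            added += 1
    return ET.tostring(root, encoding="unicode"), added


latest = [
    ("Known bookmark", "http://bookmarks.example/known"),
    ("New bookmark", "http://bookmarks.example/new"),
]
updated, count = merge_bookmarks(EXISTING, latest)
print(count, "new bookmark(s) merged")
```

Run on a timer (hourly, say), this only ever appends, so the OPML keeps the full archive even after a bookmark drops out of the latest feed.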
If del.icio.us had its own OPML then this would happen instantly, you would have your del.icio.us account as an outline archive, even at the tag level.
Instead, if I took my del.icio.us RSS and grazed it in an OPML Reader like Grazr, all I would see are the latest posts, not an archive of all the bookmarks…if I entered the Utils OPML of my del.icio.us account, I would see an archive for my whole account.
If I entered the RSS of my blog into Grazr again I would see just the latest items and not an archive, but what I can do is click on a blog title, and then read the content within Grazr…but this is only because Grazr is fetching this information.
What we want is all the data of our blog post content as an archive; an organised archive by category/tag/date would be even better.
Utils’ plan is to be able to OPML anything, well lots of things anyway…so my hope is that I can create an OPML for my blog content, not just titles, but body content as well.
Even without blog post RSS permalinks you could still offer an RSS archive feed for your blog, as well as your usual RSS feed, but this doesn’t help us with a portable blog archive or back up content for our blog.
So the question is: will this be an OPML made up of blog post HTML permalinks or blog post RSS permalinks (if possible)…and what’s the difference anyway, if you can graze OPMLs just like grazing feeds?
NOTE: Although, an RSS feed for every blog post is a bonus, you could promote the feeds of a few blog posts on the homepage of your blog…people could subscribe to these posts as they may be the type of blog posts you are always updating. That happens with some of my posts if they are a “list of tools” type post…although some blogs have pages that can be used more as webpages than blog posts.
Here is an older post, Back up blog via OPML, where I point to a blog that has its own OPML URL…I asked where do I get myself one of these, maybe Utils is the answer.
At first I thought that the OPML would be made up of the blog category feeds, but instead each item in this OPML is an HTML link to a blog post, and the blog categories act as folders, see for yourself.
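To make that structure concrete, here is a small sketch that builds the same kind of outline: one folder per category, one link node per post. It is only my guess at the shape, not how that blog generates its OPML. The function name blog_to_opml and the post URLs are invented, and link nodes follow the OPML 2.0 convention of type="link" plus a url attribute.

```python
import xml.etree.ElementTree as ET


def blog_to_opml(blog_title, posts):
    """Build an OPML string from (category, post title, HTML permalink) tuples."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = blog_title
    body = ET.SubElement(opml, "body")
    folders = {}  # category name -> its folder outline
    for category, title, url in posts:
        if category not in folders:
            # A category is just an outline node with children.
            folders[category] = ET.SubElement(body, "outline", text=category)
        # Each post is a link node pointing at the HTML permalink.
        ET.SubElement(folders[category], "outline", text=title, type="link", url=url)
    return ET.tostring(opml, encoding="unicode")


posts = [
    ("OPML", "Back up blog via OPML", "http://blog.example/back-up-blog-via-opml"),
    ("RSS", "RSS post permalinks", "http://blog.example/rss-post-permalinks"),
    ("OPML", "Grazing outlines", "http://blog.example/grazing-outlines"),
]
print(blog_to_opml("My blog", posts))
```

Note that this outline holds titles and links only, which is exactly the limitation discussed next.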
But still this isn’t archiving the content; it archives just the blog titles. For example, you can’t graze this OPML in Grazr down to the post content level, as Grazr is not a web browser, although you could do this in Bitty
…but the point is not grazing, it is archiving even the HTML of the body of each blog post.
I still ask, where do I get one of these?
I think lots of people will be interested in a portable version of their blog, an outline back up…see Lisa’s post The Blog on My Keychain.
A bonus of this, as Lisa explains, is manipulating the content when you are offline, arranging and sorting posts, even changing the content, and when you go online your blog refreshes…now we are talking about web2.0, where data can be transferred very easily in open and obliging formats.
The post, We need a shared blog format, is exactly about this, some excerpts from the post:
“If someone spends six months working with Wordpress and decides that it’s not the CMS for him, then he should be able to move to a new package more easily than he can now […] it should be possible to have a generic format for content and comments that could carry from one package to another.”
Blog post repository
RSS Archives & the grand unified blog repository is a post that furthers this concept of blog post archiving into the aggregate arena.
If all blogs had an OPML URL, then they could be aggregated into an OPML search engine…blog engines wouldn’t need to scrape HTML (this is annoying, as the process also scrapes non-post stuff like sidebar links); all the correct HTML would be in each blog’s OPML URL index.
Alternatively, if each date or category had an RSS archive, then an OPML could be made up of RSS items instead of HTML items…we could also read the entirety of a blog in an OPML Reader.
But the problem is that you still need a blog post RSS permalink, so each post can be archived…especially to search the repository: how would you present a single blog post in the results page if it doesn’t have an RSS permalink?
NOTE: The limit of RSS is that not all feeds contain the whole post content; some may be just summary or title feeds.
I’m not sure about this techie stuff, whether it’s HTML or RSS permalinks for blog posts…but if each blog wrapped its post content in OPML (not just titles, but body content) then we would have portable blogs/data.
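For what it's worth, here is one way body content could ride along inside the OPML, sketched in Python. The big assumption is the layout: each post is a top-level outline, and its body paragraphs are child outlines whose text attribute holds the (escaped) HTML. That layout is my own illustration, not an agreed format, and the names SAMPLE and read_posts and the URLs are invented.

```python
import xml.etree.ElementTree as ET

# An invented content-bearing OPML: the first post carries its body as child outlines.
SAMPLE = """<opml version="2.0"><head><title>My blog</title></head><body>
<outline text="First post" type="link" url="http://blog.example/first-post">
<outline text="Opening paragraph of the post."/>
<outline text="Second paragraph with &lt;em&gt;escaped&lt;/em&gt; HTML."/>
</outline>
<outline text="Second post" type="link" url="http://blog.example/second-post"/>
</body></opml>"""


def read_posts(opml_text):
    """Turn a content-bearing OPML into plain records another blog package could import."""
    posts = []
    for node in ET.fromstring(opml_text).find("body").findall("outline"):
        posts.append({
            "title": node.get("text", ""),
            "url": node.get("url"),
            # Assumed layout: direct children are the body paragraphs.
            "paragraphs": [child.get("text", "") for child in node.findall("outline")],
        })
    return posts


for post in read_posts(SAMPLE):
    print(post["title"], "-", len(post["paragraphs"]), "paragraph(s)")
```

If blog packages agreed on a layout like this, the same file would serve as backup, export and import format, which is the portability being asked for here.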
And if we aggregated all the blog OPMLs in the blogosphere we’d have a searchable blog repository (different from the spidering engines like Technorati)…each blog OPML could be tagged in a folksonomy or appear as an inclusion in an OPML directory
…you could share, discover, browse, search and graze blogs and blog content from within an OPML tree.
Create your own blogosphere: no requirement to spider for the latest content as the OPML URL is dynamic, and past content is always there if these OPMLs contain an archive of every blog post and its content.
Also check out tOKo does Movable Type, not sure if you can browse the body content of each post, but this is an amazing export outline of your blog, by date or topic or keyword, including comments and trackbacks…handy for the sidebar.