Friday, 16 October 2009

Archiving Blogs ...

Increasingly, blogs are being used as a means of scholarly communication, as research diaries, and so on. In this world of scholarly outputs, that means we need to capture and archive them, if only because we know intuitively that many more researchers and academics blog than use institutional blog services.

Most either have a blog site running on a machine under their desk (I exaggerate, but there are cases frighteningly like this) or use an externally hosted service such as Blogger or WordPress.com.

As we know from the JournalSpace fiasco, and as we've seen with other services such as GeoCities and Macmail closing down or just plain disappearing, we do need to think about the long-term preservation of blog content, if for no other reason than that we cannot be assured of the quality or reliability of other people's backups, or of the likelihood that external free hosting providers will wish to continue providing a service.

However, I have just happened across a paper presented earlier this month at iPres09 which suggests a breathtakingly simple methodology: essentially, take the RSS feed of a blog and repost it to a private instance, where the content can then be backed up in a manner that guarantees its long-term availability.

It's an extremely neat idea as
  1. it makes no assumptions about the blog being backed up other than the availability of an RSS feed
  2. no special software or configuration is required on the host, rendering it ideal for archiving externally hosted blogs where it is unlikely that blog authors have system-level access
  3. posting to a private instance means that the archive can be kept dark until such time as access to the original content is lost

The other thing which I like about it is that if a number of people feel a particular blog's content is worth archiving, we will end up with multiple copies, increasing the likelihood that the content will be preserved - for after all, institutions may decide to stop archiving particular blogs, perhaps because a research project has finished, or the academics concerned have moved elsewhere ...
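
To make the RSS-and-repost idea a little more concrete, here is a minimal sketch in Python. It is not the method from the iPres09 paper, just my own illustration under some loud assumptions: the feed is plain RSS 2.0 (Atom is not handled), the "private instance" is stood in for by an ordinary dictionary, and the names `parse_rss_items`, `archive_new_posts` and `SAMPLE_FEED` are made up for the example. A real setup would fetch the feed over HTTP and post each entry to a private blog, and would have to live with the fact that many feeds carry only the most recent posts, sometimes only as summaries.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed so the sketch runs without network access.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example research blog</title>
  <item>
    <title>First post</title>
    <link>http://blog.example.org/first</link>
    <guid>http://blog.example.org/first</guid>
    <pubDate>Fri, 16 Oct 2009 09:00:00 GMT</pubDate>
    <description>Hello, archive.</description>
  </item>
  <item>
    <title>Second post</title>
    <link>http://blog.example.org/second</link>
    <description>No guid here, so the link is used instead.</description>
  </item>
</channel></rss>"""


def parse_rss_items(rss_xml):
    """Return a list of dicts, one per <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    posts = []
    for item in root.iter("item"):
        link = item.findtext("link", default="").strip()
        posts.append({
            # Prefer the guid; fall back to the link if the guid is absent.
            "id": (item.findtext("guid") or link).strip(),
            "title": item.findtext("title", default="").strip(),
            "link": link,
            "published": item.findtext("pubDate", default="").strip(),
            "body": item.findtext("description", default=""),
        })
    return posts


def archive_new_posts(rss_xml, archive):
    """Copy posts not already in `archive` (a dict keyed by post id).

    The dict is only a stand-in for the private instance (assumption).
    Returns the ids of the posts added on this run.
    """
    added = []
    for post in parse_rss_items(rss_xml):
        if post["id"] and post["id"] not in archive:
            archive[post["id"]] = post
            added.append(post["id"])
    return added


if __name__ == "__main__":
    dark_archive = {}
    print(archive_new_posts(SAMPLE_FEED, dark_archive))  # both posts are new
    print(archive_new_posts(SAMPLE_FEED, dark_archive))  # nothing new second time
```

Keying the archive on the post's guid (or link) means the same feed can be polled repeatedly without creating duplicates, which is what you would want if the job ran on a schedule.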

1 comment:

tenthmedieval said...

I periodically export my blog to an offline copy, which is a utility Wordpress provide, but yes, if I were to die or something, those copies would not be available to people and presumably at some point Wordpress would take my dead self off air. So it is nice to think that if I produce any kind of worth in my blog it might be stored somewhere else too. Really, of course, my will should include instructions to print it off on archival-quality paper and donate it to some sympathetic library though :-) Only way to be sure (or at least, surer)...