Sunday, 7 March 2010

Document formats

I have recently begun playing with the Office 2010 beta in an effort to understand the implications of migrating from Office 2003 directly to 2010.

The problem of migration is not one of document formats per se, but rather one of compatibility for things such as Word and Excel macros.

Document formats are a done deal. Like it or loathe it, the 97/2000/2003 doc and xls formats are the de facto standards for document interchange. I work quite happily without ever using Word or Excel, using Open Office and Google Docs for most purposes instead. In fact, to be honest, as far as word processing goes I find that if I want a local application, AbiWord does everything I need: it is fast, lightweight and responsive.

Couple this with Google’s recent purchase of DocVerse, a company which made its money from enabling the easy sharing of Office documents across the web, and we start to see a direction, one that’s important for both cloud computing and collaborative working. While not formally connected, these two seem to go hand in hand, as the one promotes the sharing of resources and simplifies the mechanics of the other.

Once you get away from file systems and start having collections of stuff (pictures here, writing here, notes here, work-related material there, and so on), you reach the point where it actually becomes irrelevant where things are. Essentially this is a SharePoint-style workspace abstraction, rather than the use of products such as Sakai and Moodle to provide a sharing and collaboration platform, with their lack of tool and desktop integration.

So with Gladinet I have my Google Docs folder, my Windows Live filestore and my home drive at work all connected, and I have Dropbox as effectively an online thumb drive of live, crucial documents.

And while it’s not quite all there yet, it means that I can work on a document on my Mac or XP machine at work, on my Windows 7, Mac or one of my Linux machines at home, or indeed on my little travel computer from a coffee shop in town. All can use Google Docs, and all can upload to the SkyDrive, even if they can’t all mount it directly (yet).

This of course means that for the sharing of documents across platforms the document format is crucial.

In an ideal world this format should be XML-based, and should probably be ODT, as it is well known and passes the archivability test: a range of independently developed applications can make use of files created in that format.

This is important, as it means that in principle a Martian with a compiler could write a program to display the text, the formatting and the associated metadata accurately.
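To give a feel for what the Martian's job might look like, here is a minimal sketch in Python. It relies on the fact that an ODT file is a zip archive whose body text lives in a member called content.xml, with paragraphs and headings held in text:p and text:h elements. The function name odt_paragraphs is my own invention, and the sketch only recovers plain text: it ignores formatting, metadata, and special elements such as tabs and repeated spaces, so it is an illustration of the principle rather than a real renderer.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by ODF for text elements such as paragraphs and headings.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def odt_paragraphs(source):
    """Return the plain text of each heading and paragraph in an ODT file.

    `source` may be a file path or a file-like object (hypothetical helper).
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    wanted = {f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"}
    paragraphs = []
    for element in root.iter():
        if element.tag in wanted:
            # itertext() joins the text of nested spans; formatting is dropped.
            paragraphs.append("".join(element.itertext()))
    return paragraphs
```

Nothing here depends on any office suite being installed, which is really the point of the archivability test.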

Well, unfortunately it isn’t. That battle was lost long ago, due to Microsoft’s success in selling earlier versions of Office such that the world and his cat used files in these formats, which created a quasi-monopoly for the older Office formats by disposing of the opposition. In fact, their success with the older Office formats was such that it inhibited the take-up of their newer OOXML formats, because of the chaos that would potentially ensue from having Sales on 2003 and Engineering on 2007.

However, these older Office formats are now well known, and the success of applications such as Open Office, AbiWord, Google Docs and Zoho in rendering them means they pass the archivability test.

It also means that we are stuck with them for archive purposes. It is possibly time to realise that, imperfect as they are, they are also de facto preservation formats, and that while applications such as Xena that seek to normalise them are laudable, they are really only performing a conversion from the de facto to the de jure.

It also means that anywhere that wants to start a collaborative environment needs to understand that the one thing you can’t predict about sharing (and collaborating) is what sort of computer your collaborator has, and hence what software they have access to. In short, it means that you need to agree a set of acceptable document format standards, not application standards as in the past.
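As a rough illustration of what a format policy (as opposed to an application policy) could look like, here is a small Python sketch that decides whether a file is acceptable by looking at its contents rather than at the program that produced it. It uses three well known facts: the old binary Office files live in an OLE2 container that begins with a fixed eight byte signature, ODF files are zip archives containing a member called mimetype, and OOXML files are zip archives containing a member called [Content_Types].xml. The names sniff_format, is_acceptable and AGREED_FORMATS are hypothetical, as is the policy itself, and the OLE2 signature only identifies the container, so it cannot tell a doc from an xls.

```python
import io
import zipfile

# Signature at the start of every OLE2 compound file (97-2003 doc, xls, ...).
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")

# Hypothetical policy: the formats a group of collaborators has agreed on.
AGREED_FORMATS = {"ole2", "odf"}


def sniff_format(data: bytes) -> str:
    """Classify a document as 'ole2', 'odf', 'ooxml' or 'unknown' by content."""
    if data.startswith(OLE2_MAGIC):
        return "ole2"

    buffer = io.BytesIO(data)
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            names = set(archive.namelist())
            if "mimetype" in names:
                mimetype = archive.read("mimetype").decode("ascii", "replace")
                if mimetype.strip().startswith("application/vnd.oasis.opendocument."):
                    return "odf"
            if "[Content_Types].xml" in names:
                return "ooxml"
    return "unknown"


def is_acceptable(data: bytes) -> bool:
    """Check a document against the agreed (hypothetical) format policy."""
    return sniff_format(data) in AGREED_FORMATS
```

Whether the file came from Word, Open Office or a web application makes no difference to the check, which is the whole idea.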

1 comment:

dgm said...

Microsoft also seem to be getting into the sharing game, promoting the use of a Windows Live SkyDrive as adding functionality to Office 2010, and suggesting that your sharees use the Office Web Apps to access the documents from anywhere. Again, the key is a common document interchange format ...