Thursday, 30 November 2017

Preserving spreadsheets

Spreadsheets are used in lots of ways in research, which means we need to think about preserving them as part of the long-term preservation of data.

And this is actually more complicated than it sounds - as demonstrated by a recent post on preserving Google Spreadsheets.

The best preservation practice really comes down to how the spreadsheet was used.

If we are using it passively, i.e. as a way of recording data, as I’m doing on the Dow’s pharmacy project, then exporting as comma-separated or tab-separated text is the way to go, and it also circumvents the Year 1900 problem in Excel. Basically you just get the characters, and that’s all you want.

And this is great for survey data, botanical field data, archaeological data and the rest - a true lowest common denominator format.
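As a minimal sketch of what a "characters only" export looks like, here is a round trip using Python's standard csv module. The records and field names are invented for illustration and are not taken from any real project data; dates are written as plain ISO 8601 text, so no spreadsheet date system gets involved.

```python
import csv
import io

# Hypothetical records, invented for illustration only.
# Dates are plain ISO 8601 strings, not spreadsheet date serials.
rows = [
    {"item": "bottle", "label_date": "1887-03-14", "notes": "blue glass, embossed"},
    {"item": "jar", "label_date": "1902-11-02", "notes": 'marked "poison"'},
]
fields = ["item", "label_date", "notes"]


def to_csv_text(records, fieldnames):
    """Serialise a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv_text(rows, fields)

# Reading the text back gives the same characters we started with.
round_tripped = [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]
print(round_tripped == rows)
```

The csv module handles the quoting of embedded commas and quote marks, which is the part that tends to go wrong with hand-rolled exports.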

And that’s a very good thing, because if you have any pre-1900 dates in your spreadsheet, exporting from Excel to LibreOffice Calc, on the basis that Calc’s .ods format is open and non-proprietary, can cause problems.
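To see why dates are fragile, here is a rough sketch of how Excel's default 1900 date system maps serial numbers to dates. It assumes whole-number serials and that the workbook uses the 1900 system rather than the 1904 one; the function name is my own. Serial 1 is 1 January 1900, so there is nothing to represent an earlier date, and serial 60 is a 29 February 1900 that never existed.

```python
from datetime import date, timedelta


def excel_serial_to_date(serial):
    """Convert a whole-number serial from Excel's 1900 date system to a date.

    Hypothetical helper for illustration; assumes the 1900 (not 1904) system.
    """
    if serial < 1:
        raise ValueError("the 1900 date system starts at serial 1 (1 January 1900)")
    if serial == 60:
        raise ValueError("serial 60 is Excel's non-existent 29 February 1900")
    # Serials after 60 are shifted by one day because of the phantom leap day.
    epoch = date(1899, 12, 31) if serial < 60 else date(1899, 12, 30)
    return epoch + timedelta(days=serial)


print(excel_serial_to_date(1))      # 1900-01-01
print(excel_serial_to_date(59))     # 1900-02-28
print(excel_serial_to_date(61))     # 1900-03-01
print(excel_serial_to_date(43069))  # 2017-11-30
```

Any tool that converts the file has to make its own decisions about these edge cases, which is where a date can quietly change or turn into text.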

And that’s the problem with spreadsheets: if there’s any calculation, you need to ensure that the exported version correctly reproduces both the calculations and the results, which is a complicated problem.
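One partial check, sketched here under some assumptions, is to export the computed values from both the original and the converted spreadsheet as CSV and compare them cell by cell. The function names, the sample data and the tolerance are all made up for illustration, and this only tests the results, not whether the formulas themselves survived.

```python
import csv
import io
import math
from itertools import zip_longest


def cells_match(a, b, rel_tol):
    """Compare two cells numerically if both parse as numbers, else as text."""
    try:
        return math.isclose(float(a), float(b), rel_tol=rel_tol)
    except (TypeError, ValueError):
        return a == b


def compare_exports(original_csv, converted_csv, rel_tol=1e-9):
    """Return (row, column, original, converted) for every differing cell."""
    original = csv.reader(io.StringIO(original_csv))
    converted = csv.reader(io.StringIO(converted_csv))
    mismatches = []
    for r, (row_a, row_b) in enumerate(
        zip_longest(original, converted, fillvalue=[]), start=1
    ):
        # A missing cell shows up as None and is reported as a mismatch.
        for c, (a, b) in enumerate(zip_longest(row_a, row_b), start=1):
            if not cells_match(a, b, rel_tol):
                mismatches.append((r, c, a, b))
    return mismatches


# Invented sample exports: the converted copy has one wrong total.
before = "qty,price,total\n3,1.10,3.30\n2,2.50,5.00\n"
after = "qty,price,total\n3,1.1,3.3\n2,2.5,5.01\n"
print(compare_exports(before, after))  # [(3, 3, '5.00', '5.01')]
```

Differences in number formatting, such as 1.10 against 1.1, are ignored, while a genuinely different result is reported with its row and column.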

It would probably be simpler to start with a product that uses an open format, such as Gnumeric or LibreOffice Calc, and then export the document to Google Drive, Dropbox or OneDrive for sharing, rather than start with an online spreadsheet. And if you need to start with an online spreadsheet, Microsoft’s online version of Excel might be a better departure point, as its compatibility with the standalone version of Excel gives a better chance of conversion to an archival format ...
