Friday 4 April 2014

Big archives and data liberation ...

There is a lot of chatter around big data and how it is going to save the world.

While it is true that some of these incredibly large datasets have value beyond allowing supermarkets to know which brand of cat food you, or rather your cat, prefers, they also tend to obscure another aspect of the data revolution.

That aspect is not the very large datasets themselves but the volume and diversity of the smaller datasets that have been put online as a result of various digitisation projects.

For example, I have found online accounts and pictures of a project my mother worked for, which helped orphaned Basque refugee children from the Spanish Civil War, and the details and records of the sinking of the ship on which my father was second engineer during the retreat from Singapore.

It is not that these things were unknown, but even 10 years ago finding the records would have involved writing to archivists, a trip or two to the archives themselves, requests for material to be copied, etc. It would have taken a lot of time and, given that I live 20,000 km from some of the archives in question, been damn near impossible.

Today I can find most things from my desk, starting with Google and perhaps a few likely resources, just as I did to find out how news of Lincoln’s assassination reached Australia.

Such is the power of online that material that isn’t online effectively does not exist. That is undoubtedly not good, as it means that non-digitised sources will tend to be ignored, skewing scholarship. But we also need to recognise the power of online access to resources and how it empowers people not only to ask questions but to find answers, even if the questions they have are about early medieval cats rather than something profound.

In his novel 1984, George Orwell came up with the memorable line:
"Who controls the past controls the future: who controls the present controls the past."
Scarily true when one looks at the rewriting of history to justify current actions. Digitising resources and making them open and available reduces the risk of manipulation, something which also goes for research data ...
