Friday 27 June 2014

25 years of the internet in Australia

Australia’s been connected to the internet for twenty-five years now.
Of course I don’t remember this, as I wasn’t here - I was working in a university computer centre in the UK, and in 1989 it was still all DECnet and Coloured Book protocols.

(The UK had invented its own set of networking protocols, and connection to the rest of the world was either via complex email address translations or strange non-interactive FTP incantations - something I was a dab hand at - sending files to India via Usenet and so on …)

A couple of years later (I’ll say 1991, but actually I can’t remember) came Project Shoestring, a pilot migration to TCP/IP. It recognised that the Coloured Book protocols were not going to make it globally and that we’d all better move to TCP/IP - something that made the Unix people very happy (and the Macintosh users too - they could use Eudora for email and send each other BinHex-encoded attachments).

Now, we still ran a student advisory service - a hangover from the days of batch programming. Essentially, one of the programming team sat in a booth and answered user queries as to why their batch job had gone stupid.

I was officially an analyst programmer, which meant I had to do advisory duty even though I was singularly useless at it - rather than Fortran coding, I knew about network transfers, document formats and these pesky new things called desktop computers. My principal contribution to human happiness at the time was explaining to US exchange students how to email their girlfriend or boyfriend at somewhere.edu, or their mum or dad who had a CompuServe account.

Anyway, one day shortly after we started a TCP/IP service, a visiting Australian academic turned up asking if we could help her access her email. She had thought to bring the numeric address of the mail server, so I fired up a VT100 emulator, connected to a terminal server, and typed c tcp {server ip address}. After a few seconds a login banner appeared, typed out at what looked like 300 baud; she logged in and fired up elm to read her mail.

Clumsy for sure, but there it was, a connection across the planet …

Written with StackEdit.

Thursday 26 June 2014

E-readers are (probably) dying

E-readers, as in dedicated devices for reading content, are on the way out.

How do I know this? Well, just by looking about. I’ve occasionally blogged about people’s reading habits on public transport in Canberra and even Singapore, but yesterday I was in Sydney for a meeting and did something I hardly ever do these days - caught a peak hour train. The train was one from the CBD out past the airport to wherever the train goes once it’s past the airport, and being peak hour it was pretty full.

In between gawking at the sunset over the Harbour Bridge (tip: try Circular Quay station at sunset for an excellent view) I looked around at my fellow passengers. The carriage was pretty full, and around three quarters of the people were reading something. A few oldies with paperbacks, but everyone else was using a tablet or a smartphone. Interestingly, young people of Asian background seemed mostly to read on smartphones, while their western counterparts seemed to prefer 7” tablets. (I agree, they could have been doing something else, but of those I could see clearly, whatever they were doing involved screenfuls of characters - which kind of looks like reading to me.)

No one I could see was reading on a Kindle or any other e-ink grey-screen device.

Now, e-readers have many virtues - especially Kindles: buy your book and it lands on your reader as soon as you have a wireless connection, the battery lasts for ages, and so on - in fact they are very good at what they do. The proof of the pudding is that, as well as my Kindle, I still use my Cool-er for reading public domain books from Project Gutenberg as part of my unarticulated informal research into the nineteenth-century colonial experience. Yes, of course I could use my Kindle for this, but having two readers on the go means I can separate reading for interest from reading for fun - a bit like having two books on the go at once.

And I’m sure that a great many people will carry on using their readers, but if you’re carrying one device round with you, you’re more likely to carry a tablet, because of its versatility …

Written with StackEdit.

Monday 23 June 2014

Curating Legacy Data

Data is the new black: everything seems to be about data at the moment, and it’s desperately trendy. At the same time there is an entirely laudable movement towards researchers making the data that underlies their research available, for all sorts of reasons, including substantiation, reuse, and recombination with other data sources.

This sometimes gets conflated with Big Data, but it shouldn’t - outside of a few disciplines, most experimental data is pretty small, and even quite large sets of results will fit comfortably on a twenty-dollar USB stick.

It’s important to remember that this is a recent phenomenon - only in the last five years or so has cloud storage become widely available, and commodity hard disks large and cheap.

Before then, data would be stored on Zip disks (remember those?), CDs, DVDs, or DAT tapes - all formats which are either dead or dying, and all of which are subject to maintenance issues: basically bitrot due to media degradation.

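As an aside, the cheapest insurance against silent bitrot on media you can still read is a fixity check - record a checksum for every file now, and re-run the check later to spot corruption. Here’s a minimal Python sketch of the idea (the legacy_data directory name is purely an assumption for illustration; the standard sha256sum command line tool does much the same job):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Record a checksum for every file under a (hypothetical) data directory;
# re-running this later and comparing the output reveals silent corruption.
for item in sorted(Path("legacy_data").rglob("*")):
    if item.is_file():
        print(f"{sha256_of(item)}  {item}")
```
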
Even if you’ve stored your data on an older external hard disk you can have problems - my wife did just that, and then lost the cable to a four-year-old external disk. It of course turned out to be a slightly non-standard variant on USB, and it took us a lot of searching of documentation and cable vendors (including a couple of false starts) to find a suitable cable. When is a standard not a standard? When it’s a proprietary one.

Recovering this legacy data is labour intensive. It can be in formats that are difficult to read, and it can require conversion (with all that implies) to a newer format to be accessible - which can be a special kind of fun when it’s not a well-known or well-documented format (nineteen-nineties multi-channel data loggers come to mind).

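To give a flavour of what that conversion work looks like, here’s a purely hypothetical Python sketch - the record layout below is invented for illustration, and a real logger’s format would first have to be reverse-engineered or dug out of surviving documentation:

```python
import csv
import struct

# Entirely hypothetical record layout: a 1990s logger writing fixed-size
# binary records of (uint32 timestamp, 4 x float32 channel readings),
# little-endian. Any real device's layout would need to be established first.
RECORD = struct.Struct("<I4f")

with open("logger.dat", "rb") as raw, open("logger.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["timestamp", "ch1", "ch2", "ch3", "ch4"])
    while chunk := raw.read(RECORD.size):
        if len(chunk) < RECORD.size:
            break  # trailing partial record; a real converter would log this
        writer.writerow(RECORD.unpack(chunk))
```
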
So, what data should we convert?

Well, most scientific publications are rarely read or cited, so we could take a guess and say that it’s probably not cost-effective to convert the data underlying them - though someone did once ask me if I still had the data from an experiment I did back in the nineteen eighties. It turned out they were having difficulty getting regulatory approval for their physiology study and thought that reanalysing my data might give them something to help their case. I’m afraid I couldn’t help them: the data was all on five and a quarter inch Cromemco Z2D disks, or else punch cards, and long gone.

So a legacy data curation strategy should probably focus on the data underlying highly cited papers - it’s likely to be of greater value, and there’s a chance it might have been stored in a more accessible format.

However, even recovering data that’s been looked after still has costs associated with it - costs to cover the labour of getting it off the original source media and making it useful. And these are real dollar costs.

From experience, getting a half-dozen nine-track tapes read costs around fifteen hundred bucks if it’s done by a specialist media conversion company, and the administration, shipping, and making-useful phase probably costs another fifteen hundred - less if some poor graduate student can be persuaded to do the work. That’s still a reasonable chunk of money, and money that needs to come from somewhere.

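As a back-of-envelope check, here’s that arithmetic as a trivial Python sketch (the figures are the indicative ones quoted above, not a quote from any vendor):

```python
# Back-of-envelope, using the indicative figures quoted above
tapes = 6                 # a half-dozen nine-track tapes
media_conversion = 1500   # specialist media conversion company
making_useful = 1500      # administration, shipping, making the data useful
total = media_conversion + making_useful
print(f"${total} total, about ${total // tapes} per tape")  # $3000, ~$500 per tape
```

Call it five hundred dollars a tape before anyone has analysed a single byte.
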
So, who pays, and is it worth it?

[Update 26/06/2014: Notes of a meeting in Sydney on this very subject ...]
Written with StackEdit.