Friday, 29 October 2010

Resourcing academic computing

Have just been to a rather interesting presentation on the ARCS data fabric.

The ARCS data fabric is in essence an initiative to build a shared storage cloud in Australia for use by researchers and also to combine this with a grid based execution environment.

I have previously written about how people tend to assemble their own toolkits of resources, and how this has some odd effects, such as wikidot becoming a wiki provider of last resort in academia. It's also the case that any toolkit of resources should include some off-site storage for key documents, if for no other reason than that hard drives break and computers get stolen. And often an extra benefit of remote storage is that it has a permissions model, meaning that you can share some of the content of your remote store with friends and colleagues, some with the world, and keep some private - in effect owner:group:world. One example of such a service is Windows Live's skydrive, which is bundled with Windows Live accounts, and which is free at the point of delivery.

Now, the interesting thing about the ARCS data fabric is that it uses webdav, provides very similar sharing functionality to Windows Live skydrive (and the same amount of storage by default), and is currently free at the point of delivery.

Unlike the ARCS service, Skydrive does not come with a desktop connector, but Gladinet will sell you one that will link to a whole range of storage providers, including webdav hosts and Windows Live, allowing cross mounts - which, for example, allows you to move content from Google Docs to Windows Live.

One of the great debates we always have in university computing is whether to outsource email, and if we do, whether to choose Microsoft or Google.

Ignoring theology, the differentiator has always been that Google provides better tools in the form of Google Apps, including shared document editing, and Microsoft provides better storage in the form of skydrive. It's also true to say that Microsoft is more Windows oriented, but with the new Office web applications they are becoming much more agnostic about such things, and their web apps are now as functional and environment-agnostic as Google's.

So, purely for the sake of argument, let us say we outsource email to Microsoft, and use the money we save to licence an appropriate connector to mount skydrive on the desktop. Now we know we don't save a lot of money getting rid of student email as we know students increasingly self outsource and use a webmail service for email. If we don't also outsource staff email we end up having to provide almost as much infrastructure as before.

So there is no great win in only outsourcing student email. But remember that going to Microsoft gets you skydrive. And this could be a radical opportunity, possibly too radical: we could decide to no longer provide student filestore and tell students to use skydrive instead.

Now, we do save money by not providing student filestore, provided that performance is at least as good as any existing student filestore, and that we trust bigM not to lose any data. Student data lives in the cloud and ideally is accessible from any location on any platform via a browser at a minimum.

Now the endpoint of this is we end up providing a service very much like the ARCS data fabric to students and by implication, all members of the university, and even these days 25GB is a reasonable amount of online storage.

So, in this scenario, do we see Microsoft as a suitable provider of storage for researchers' work in progress, and do we then see initiatives such as the ARCS data fabric turned into providers of specialist storage, either for large datasets or to get around legal/jurisdiction related problems with sensitive data?

And does that mean we increasingly see a landscape of outsourced filestore?

Thursday, 28 October 2010

Microsoft rolls over

Like a lot of people in the technology game I have multiple email addresses and blog sites. One of these is a Microsoft Live account, which I use primarily for email from those people who never update their address books, and which comes with a skydrive account that I use primarily as always-on filestore to access content from wherever.

Bundled in this is, or rather was, a service called Spaces, which included a blog offering that, I must say, was not as slick as wordpress or blogger, and which I hardly ever used.

But I might do in the future - I've just received an email from Microsoft that reads in part:

"Important changes are coming to your Spaces account that affect you and will require you to choose an option that is right for you. We are very excited to announce our collaboration with a premier and innovative blogging service,, to offer you an upgraded blogging experience. We'll help you migrate your current Windows Live Spaces blog to or you can download it to save for later. You should know that On 16th March 2011 your current space will close."

Which is kind of interesting. You could read it as Microsoft saying "we're having difficulty getting uptake with this blog thing, so we're going to can our product and outsource it".

And of course the two biggest providers are Wordpress and Google, which kind of leaves them with only one place to go.

Anyway, I've migrated my Microsoft blog, such as it is and you can find it at I wouldn't hurry though...

p-books versus e-books

Last night, for the first time since I fell in love with my e-reader ten weeks or so ago, I started reading a paper book, ie a traditional paperback, in fact a reprint of J F Ackerley's Hindoo Holiday.

And despite being a new edition, it is a reprint of an earlier version - when one looks at the printing it has been photographically enlarged to fit the newer 190x130 paperback format rather than the original 177x107 format, and clearly photoset from an earlier printed version.

What is very noticeable after using an e-reader are the minor printing defects, eg occasional broken ligatures and other imperfections introduced by the printing process. Text is slightly less contrasty and slightly easier on the eye, but at the same time slightly more difficult to read in lower light situations.

But the thing that is most noticeable is how much more comfortable a proper book is to read, purely due to not having to hold the e-reader rigidly in one hand in order to press the ipod-like next page button.

This may of course be easier on other models of e-reader, and a redesign of the next page button, say as a squeezable edge, might solve the problem.

I'm also being unfair - having been an avid reader since the age of eight I have almost half a century's experience of reading traditional books and ten weeks with one particular model of e-reader, and in time one would undoubtedly adapt - try going back to driving a car without power steering, or using a manual typewriter to discover how much one's technique changes over the years.

That said, I'm still more than happy to read books on my e-reader and have a pile of public domain books I want to read. I've also got a pile of paper books to read. It'll be interesting to see how my reading habits change over the next year to eighteen months ...

University Cuts

I've been studiously avoiding commenting on the impact of the UK's Comprehensive Spending Review on university funding, particularly on the Arts and Humanities.

Despite spending eight years studying in one way and another and twenty years working in UK universities (including a stint as an AUT branch secretary), I simply do not feel that, after seven years in Australia and being only an occasional visitor to the UK, I have the right to comment. It's like going back to somewhere you used to live - there are inevitably changes, some better, some worse, and some just plain confusing.

However, one thing I feel strongly about in the move to a 'user pays' model, and Australia is probably closer to that than the UK, is that it results in a set of imbalances in the system as students start to boycott 'difficult' subjects and subjects that are seen as unlikely to enhance employment prospects.

And so the hard sciences and the arts courses wither away and we have the rise of business studies and the like. Now, despite my occasional rants on the subject I'm not against business degrees per se. When I first started having to look after projects and purchasing contracts I could definitely have used a whole range of business skills, in budgeting, contract management, contract law, project planning and the like.

I do not however feel that Business Studies is a stand alone subject. Like a number of IT courses it is an applied enabling subject that allows you to be more effective. Not more original or innovative.

Such courses do not give you the depth that comes from studying a complex subject thoroughly, be it molecular biology or the history and archaeology of the near east.

Studying complex and difficult subjects, where there are no right answers, teaches you to think, assimilate often contradictory and complex information, analyse, present, argue and the rest of it.

Of course, in a user pays environment there is also an expectation that students will get a decent degree at the end of it, and with the expansion of higher education, there are increasing numbers of less able students. The result is of course grade inflation and a drift towards safe and easy subjects.

Elitist? Yes. To continue to develop and innovate societies need to produce thinkers, movers and shakers, and to do that we need to get the best out of people, and to do that we need an environment where people can be stretched and taken in different directions. And to produce that takes money, and a tolerance of apparently useless subjects. Not for what they do, but for what they provide in the way of stimulation ...

Monday, 18 October 2010

ipads and excavation

I recently tweeted a link about the use of iPads at Pompeii.

As a veteran of many discussions with archaeologists as to what sort of machine would work best down a wet hole, I can say the answer always used to be something cheap and disposable, preferably with decent battery life. Linguists doing field work in NT and PNG have similar problems, as do botanists, anthropologists and the like, but archaeologists always seemed both to be first out the gate and to come up with the most extreme environments for data capture and recording.

Until digital technology became all pervasive other disciplines tended to stick with analog technologies as they tended to be just that little bit more robust.

Prior to the netbook revolution, the answer to what computer works down a muddy hole always seemed to be second user thinkpads or macbooks, or if the budget would stretch to it, Panasonic Toughbooks.

Post the netbook revolution, cheap machines with ssd drives seemed to be the way to go, even if they were still prone to damp and dirt and dust getting sucked in. The iPad seems to be a logical evolution - a touch screen means fewer risks of keyboards getting clogged, and the sealed design of the iPad with few if any holes helps guard against damp and dirt sneaking in.

What would be interesting is the attrition rates of netbooks against iPads.

For example, here in Australia, an iPad costs a little over $600 and a (non SSD) Samsung netbook a little under $400. Which basically means you can wreck three netbooks for every two iPads you lose.
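The three-for-two figure is easy to check. The sketch below is only that, a sketch: it rounds the prices quoted above to $600 and $400 (an assumption, since the real figures were "a little over" and "a little under"), and the function names are made up for illustration.

```python
from fractions import Fraction

# Rounded 2010 Australian prices from the text (assumed round numbers).
IPAD_PRICE = 600
NETBOOK_PRICE = 400


def breakeven_loss_ratio(ipad_price: int, netbook_price: int) -> Fraction:
    """Netbooks you can afford to wreck for each iPad lost, at equal spend."""
    return Fraction(ipad_price, netbook_price)


def replacement_cost(unit_price: int, units_lost: int) -> int:
    """Total cost of replacing a number of wrecked devices."""
    return unit_price * units_lost


ratio = breakeven_loss_ratio(IPAD_PRICE, NETBOOK_PRICE)
print(f"Break-even: {ratio.numerator} netbooks per {ratio.denominator} iPads")
print(replacement_cost(NETBOOK_PRICE, 3), replacement_cost(IPAD_PRICE, 2))
```

Put another way, on purchase price alone netbooks stay the cheaper option as long as they fail less than one and a half times as often as iPads do in the same trench.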

And of course the netbook is a general purpose computer (which means it can be used for things other than data capture and data entry) and connecting usb devices and printers is a damn sight easier than with an iPad.

And tablets for data entry in hospitals have been around for years, and while expensive these ones really are rugged and proof against fluids etc.

So, gee whizzery aside, does the iPad provide a cost effective alternative to data entry for the field sciences, and by implication to the classic field notebook?

Ubuntu 10.10

Last thing Friday I built myself an Ubuntu 10.10 vm on top of VirtualBox running on a mac.

It just worked - ok, I had to add a couple of personal favourites like kwrite and abiword, but they installed neatly and updated menus correctly. No insertion of fingers in ears or dancing round rowan trees required. Basically a Windows 7 or OS X style experience.

The next stage would, obviously, be to build a real machine, but so far everything looks very good. Nothing to carp about at all ....

Friday, 15 October 2010

- a first look

I've just had a very quick first look at, Microsoft's Facebook authenticated competitor to Google Docs. Basically:

  1. Authentication is via facebook. There appears to be no obvious way to link to an existing windows live account
  2. Editing feels to be as responsive as Google Docs, and perhaps a little better than Zoho
  3. Documents can be printed on a local printer
  4. Documents can be shared, but only with existing Facebook friends. However, each document has a public url that can be given to other people to allow them to access documents providing they have a Facebook account. I haven't tested what happens - my test document is at
  5. Documents can be downloaded and opened locally with Office in a single click operation. There do not appear to be other export/download options
  6. The interfaces are Office 10 like with a tabbed structure

Would I use it?

Probably not. Personally I'm happy with Google Docs for my lightweight wordprocessing and spreadsheet needs, but given that some of the 500 million Facebook users use Facebook to the exclusion of other services, it's an interesting and useful addition to the Facebook ecology.

I imagine we'll start to see documents created in Docs being submitted as part of student assignments etc in due course ...

Thursday, 14 October 2010

Source Code and data archiving

Interesting article (pdf doi:10.1145/1831407.1831415) from this month's Communications of the ACM on whether scientists should release their source code along with experimental data for review.

It's my view that they should - large experiments in the disciplines of genomics, astronomy, physics and the like often produce terabytes of data, which is unmanageable by standard processing techniques, meaning that data is often filtered at the instrument level, and sometimes by custom built FPGAs.

And this very simply means that there is a risk of introducing artefacts due to errors in the gate array code, meaning that we are looking at the risk of producing chimeras, ie results that actually aren't there and producing the digital equivalent of cold fusion.

This is a risk in all disciplines that preprocess their data, and here the source code is simply part of the experimental method and hence should be open to review. The same also applies to code designed to process the results. Errors can creep in, and not necessarily due to coding errors on the part of the people carrying out the analysis. Both the Pentium floating point bug and the VAX G_FLOAT microcode bug could have introduced errors. (In fact the latter error was noticed precisely because running the code on a VAX 8650 gave different results to running it on an 8250).
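As an illustration of how a hardware fault of this kind can be caught, here is a minimal sketch of the check that was widely circulated for the Pentium division bug. The operand pair is the commonly quoted one; the function name and the tolerance are my own choices, not part of any standard test suite.

```python
def fdiv_residual(x: float = 4195835.0, y: float = 3145727.0) -> float:
    """Return x - (x / y) * y.

    On hardware that divides correctly this is zero, or at worst a
    rounding error far below 1. Flawed Pentiums famously returned 256
    for these particular operands.
    """
    return x - (x / y) * y


residual = fdiv_residual()
print("residual:", residual)
print("division looks sane" if abs(residual) < 1e-6 else "division is suspect")
```

The point for reviewers is that a sanity check like this belongs with the analysis code: it is cheap to run, and it documents what the authors assumed about the machine the code ran on.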

And this introduces a whole new problem for archiving:

If we archive the source code can we be sure that it will run identically when recompiled and run under a different operating system and different compiler?

It should, but experience tells us that this won't always be the case. And emulation, while it helps, is probably only part of the answer.
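One reason is that floating point addition is not associative, so a compiler or library that evaluates the same expression in a different order can legitimately produce a slightly different answer. The sketch below shows the effect and one possible way of comparing a re-run against archived reference output; `results_agree` and its tolerance are hypothetical, chosen purely for illustration.

```python
import math

values = [0.1, 0.2, 0.3]

# The same three numbers, summed in two different orders.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

print(left_to_right == right_to_left)  # the two sums differ in the last bit


def results_agree(archived: float, rerun: float, rel_tol: float = 1e-9) -> bool:
    """Compare a re-run against an archived value, allowing for rounding."""
    return math.isclose(archived, rerun, rel_tol=rel_tol)


print(results_agree(left_to_right, right_to_left))
```

So archiving the source alone is probably not enough: keeping reference outputs and a stated tolerance alongside it at least lets a future reader tell a rounding difference from a real discrepancy.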

Wednesday, 13 October 2010

student computer labs

Despite my periodic outbursts against the concept, we still have them, which kind of begs the next question:

are we doing it as well as anyone else?

To answer that I tried a novel approach - flickr. I searched flickr for photographs of computer labs (well, students spend a lot of time there and it's something marketing types like photographs of to show how good UofX's facilities are, so the chances are that we'll get some representative images).

On the basis of my search I can say:

  • computer labs are either funky or grungy
  • there are four basic designs:
    1. traditional, everyone face the front designed for chalk'n'talk
    2. traditional, computers round the walls - walls make it easy to provide networks and power points, and if you want people not to talk to each other (aka individual study) it's not a bad design
    3. small group without dividers - aim is to provide facilities for group work and usually has a shared table type design
    4. small group with dividers - designed for individual study but without the regimentation of options #1 or #2

There are some variations - like funky long benches with offset sections to try and bridge the requirement of small group versus individual work.

What I haven't been able to search for is what I call 'beanbag labs', ie semi formal study areas where students can bring their own laptops, connect to wifi, and work between tutorials. I suspect that these are provided as open bench areas or as clusters of traditional carrels in the main, but that institutions with newer funkier facilities provide a range of options from the traditional reading room bench to the more comfortable chairs and beanbags type area.

The two sets of pictures (has to be two as flickr only allows you 18 pictures in a gallery) can be viewed at:

All images are copyright their original owners unless otherwise stated.

Kindle singles

More on the e-book theme: Amazon have just announced a shorter document format called 'Kindle Singles'. In essence it's a digital pamphlet format and, being reflowable, it makes short form documents, currently almost universally distributed as pdf's, easier to read on a kindle.

It's been my experience that working with pdf's on an e-reader is a pain - pdf's are designed to display as per the printed page, which is not necessarily what you want on an e-reader - at its simplest you want the document as a single column, not a two column format.

And while the linearity implicit in e-readers is an irritation, it's probably less with short form documents, where it's possible to hold the document in one's mind as a whole.

TNW is probably correct when they suggest that the short form Kindle publication is a play for the college market - it would open up the e-reader for things such as electronic reading bricks, and if journal publishers were to adopt a reflowable format, make possible the easy reading of research papers on an e-reader.

This has of course implications for digital archiving. At the moment most self archived scholarly articles are in pdf - we should really be treating pdf as a derived format and using something such as TEI to faithfully represent the document as published and then provide options to have it as pdf for later printing, epub for reading on an e-reader etc etc ...

[Update: one of the other nice features about using TEI or some other super format for contents storage as opposed to content delivery is that it makes content delivery in alternative formats a simpler problem, allowing the simple on-demand production of books in alternate formats such as DAISY for use by the visually disabled.

It would also allow the text to be fed into a typesetter (or an espresso book machine) to allow the on demand printing of texts in bookstores for those people who still need a printed copy. Doing this gets round the delayed gratification problem of ordering books from online retailers and then waiting n days for the postal service to deliver your book]
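To make the master-format idea concrete, here is a minimal sketch of one source producing two delivery formats. It assumes a deliberately cut-down TEI-like fragment (a real TEI document also needs a teiHeader and would not validate as written here), and the function names are invented for illustration; a production pipeline would more likely use XSLT stylesheets.

```python
import xml.etree.ElementTree as ET
from html import escape

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Cut-down master document: illustrative only, not valid TEI.
MASTER = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <head>A short paper</head>
    <p>First paragraph.</p>
    <p>Second paragraph with an &amp; in it.</p>
  </body></text>
</TEI>"""


def parse_master(xml_text):
    """Pull the title and paragraphs out of the master copy."""
    root = ET.fromstring(xml_text)
    body = root.find("tei:text/tei:body", TEI_NS)
    title = body.findtext("tei:head", default="", namespaces=TEI_NS)
    paragraphs = [p.text or "" for p in body.findall("tei:p", TEI_NS)]
    return title, paragraphs


def to_plain_text(title, paragraphs):
    """Single-column text, the sort of thing an e-reader can reflow."""
    return "\n\n".join([title.upper()] + paragraphs)


def to_html(title, paragraphs):
    """HTML, a plausible starting point for epub or print conversion."""
    parts = [f"<h1>{escape(title)}</h1>"]
    parts += [f"<p>{escape(p)}</p>" for p in paragraphs]
    return "\n".join(parts)


title, paragraphs = parse_master(MASTER)
print(to_plain_text(title, paragraphs))
print(to_html(title, paragraphs))
```

Each new delivery format is then just another small rendering function over the same archived master, which is the argument for treating pdf as derived rather than canonical.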

Tuesday, 12 October 2010

blogger bizarreness

For some reason the blogger editor has started giving me the option to edit documents in Indian scripts - and I cannot work out how to turn it off ...

Monday, 11 October 2010

That man Thucydides gets everywhere ...

Actually he doesn't, but by pure co-incidence there were two other similar posts about the same time as I posted my blog article on Thucydides and the e-reader:

There may have been others, I simply don't know, as part of my holiday regime is not to read blogs and tweets (and not to post either; the Thucydides post was written on the ferry somewhere between Venice and Patras in mid September).

However both these posts encompass some simple truths

  • e-readers are light, portable and can hold gizillions (well a lot) of books
  • the battery life is close to excellent
  • the screens work even in bright sunlight

greeks, pigs and oaks

On our recent trip to the Peloponnese I was struck by just how many oaks there were - big majestic ones lining the road over the mountains between Kalamata and Sparta, small scrubby sessile ones in the hills behind Kardamyli. And of course where you get oaks you get acorns, and acorns mean pigs, and pig hunting was important to the ancient Greeks both culturally and as a source of protein. (It still is in the north of Greece, and pork is often the only meat other than the ubiquitous goat available at rural tavernas).

Now a long time ago, long before I met J, I had a girlfriend who was doing a PhD in ethnobotany and was particularly interested in the use of nuts, including acorns, as a food resource by pre-agricultural populations in Europe, and hypothesised that these populations might have actively managed the forest to provide oak groves to aid the harvesting process, in much the same way as populations in PNG and some Australian aboriginal populations actively managed the bush to enhance the abundance of particular plants.

We know for sure that the Greeks ate acorns, there are literary references to the poor eating acorns, which would certainly provide an inducement for people to manage the forest for acorn production.

And half way up a Greek hillside it occurred to me that if this was the case, the oak groves would also have attracted pigs, making them easier to capture and kill. Now acorns are not the most enjoyable things to eat - they take time to prepare and they're pretty boring - I know, as one of Sarah's predilections was to experiment with these forgotten foods, and all I'll say is that having eaten them, I'd say they're up there with ground elder as something best left forgotten.

If you are curious there's a number of recipes on the web, or else you could try looking in a Korean supermarket for acorn noodles or acorn jelly - acorns were, and still are to a limited extent, also eaten in Japan and Korea.

Pigs of course have a different opinion about acorns. They relish them. And pigs fed on acorns taste particularly good, as the packs of acorn fed jamon serrano in Spanish supermarkets attest.

However, I wondered if the presence of pigs as a resource made it worth continuing to maintain these oak forests and harvest acorns after people moved on from eating acorns, as the acorns attracted pigs and also helped the domestication process, with the pigs hanging around knowing that they would get acorns and other food out of season. So instead of managing the forest for acorns, people started managing the forest for pigs ...

Sunday, 10 October 2010

Push email

Just as I finally got the e-reader on our recent trip, I also finally really got push email. Basically it means that email will get to your phone as long as it's on a network somewhere, without the phone having to periodically poll the server, which wastes connection time.

And of course the service can be refined to strip out graphics and the like so all you get is text on your phone.
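The saving is easy to see in a toy model. What follows is a sketch under stated assumptions, not a description of how any real push service works: the arrival times, the fifteen minute polling interval and both function names are hypothetical, and real push services typically hold a connection open or rely on the carrier to wake the phone, which the model ignores.

```python
from typing import Sequence


def polling_connections(duration_min: int, poll_interval_min: int) -> int:
    """Connections made by a phone that checks the server on a fixed timer."""
    return duration_min // poll_interval_min


def push_deliveries(arrival_times_min: Sequence[int], duration_min: int) -> int:
    """Deliveries made when the server contacts the phone only as mail arrives."""
    return sum(1 for t in arrival_times_min if 0 <= t < duration_min)


# Hypothetical day: three messages arrive during eight hours of travel.
arrivals = [35, 190, 410]
day = 8 * 60

print("polling every 15 min:", polling_connections(day, 15), "connections")
print("push:", push_deliveries(arrivals, day), "deliveries")
```

With polling the phone connects whether or not there is anything to fetch, and a message can still wait a full interval before it is seen; with push the work scales with the mail that actually arrives.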

Now, unlike last year, we had a lot of trouble with wifi access, but having email automatically update on the phone everywhere we went (except Greece and, strangely, France) meant we were always in email contact and could send and receive email from just about everywhere, be it emailing my niece about the introductory Farsi texts stocked by Blackwell’s in Oxford from a pub opposite (Don’t ask, the explanation is perfectly logical but tedious and complicated) or being able to show the car rental agent the booking reference number in an email on my phone (the no printer while travelling problem).

And it meant that when someone did need to get in contact they could. Basically it gives email the utility and pervasiveness of text messaging, but infinitely more useful as it’s also in your inbox when you finally get online to manage your email.

Saturday, 9 October 2010

Thucydides and the e-reader

It was somewhere over the Arabian sea that I finally learned to love my e-reader.

I’d started re-reading my battered Penguin classics translation of Thucydides’ History of the Peloponnesian War, fully intending to take it with me on our trip to Europe.

However, one look at my already bulging daysack convinced me otherwise, and did I really want to lug half a kilo of dead tree round Europe?

So on a whim, I plugged an old spare 16MB SDcard into my laptop, downloaded Crawley’s masterful 19th century translation to it in epub format from Project Gutenberg, clicked it into my e-reader, checked it worked and stuffed my e-reader into my pack instead.

Long haul flights are boring. You eat, watch a movie, try to sleep as much as possible but the sad truth about a fourteen hour flight is that you will get bored and run out of options to entertain yourself. So as the sun rose I turned to my e-reader and sat there enjoying the beauty of Crawley’s nineteenth century prose.

I was hooked, and stayed hooked as long as there was some reasonable light. In use the device was light and easy to handle – so much so that in the course of the trip I’ve worn some of the paint off the ‘next page’ key.

In Venice I downloaded some other texts to read via my travel computer and continued to read using the e-reader, even in bright sun in a Greek olive grove. And the battery life was fantastic – 8000 page turns is between 16 and 20 books worth of reading, more than enough for a month’s travel.

I did take a conventional paper book with me, and I brought it back unread, so converted to the e-reader was I.

And the trick of using an old SD card freed me from the need to carry the download cable everywhere as both my travel computer and home laptop have SD-card slots. (Incidentally, in London Nintendo was advertising a cartridge of public domain classics to read on the train – trying to muscle into the e-reader market).

I guess the only long term question is the availability of copyright books in epub format. On the other hand I’m tempted to sell my soul to Amazon and try a Kindle ...

[update 12/10/2010: well, as a token of my conversion to e-books I've gone and bought my first e-book, George Roux's History of Ancient Iraq. Apart from needing to install Adobe Digital Editions on my laptop, the download process was straightforward - get the acsm token file from the bookshop, load it into Adobe Digital Editions, wait for the book to download from stuff central and drag and drop it onto the e-reader...]

Thursday, 7 October 2010


As you may have guessed from the blizzard of tweets, we're back from four or so weeks in Europe - various posts and pictures will appear over the next week or so.

And yes - we had a great time !