Thursday, 31 December 2009


We'd hoped to have tomatoes for New Year but we haven't quite made it. A few reddish ones but no really ripe tomatoes yet.
However our apricot tree has risen to the occasion. Fresh apricots off the tree for the New Year's morning fruit salad.

Happy New Year everyone!

Wednesday, 30 December 2009

Digital Patrimony, digital archiving, and the repository

Yesterday evening I was musing about William Dunbar's Flyting of Dunbar and Kennedie, wondering if it would be fun to investigate doing a reading in two voices as a performance, perhaps a little like the Irish Catullus.

And this led me to another thought - we're always very unclear when we talk about institutional repositories and digital archiving what we want to do. If you work in a university as I do the probable answer is something along the lines of "capturing the scholarly outputs of the university to increase access and enhance the reputation of the institution". And this is basically thought of as something like a preprint server with some indexing and metadata so you can easily find out who is working on chimpanzee tool use for example.

And this is a model that works reasonably well for the sciences - after all it's basically what scientific scholarly publishing has been doing for years.

The preprints are documents in their own right, rarely subject to modification and can be happily distributed in a non-revisable format such as pdf.

And then we turn to what these days are called the 'humanities and the creative arts'. And it all gets messy but perhaps our friends Dunbar and Kennedie can help us.

The Flyting was originally a court entertainment cast as a duel of wit (and copious obscenity) between two poets of the day. Imagine it as a sort of Medieval Scots rap name calling contest. (Or as an early analog of Commedia dell'arte.) Even better imagine it done by the Baba Brinkmans of the day. The text was written down and appeared in one of the first books printed in Scotland.

It's important as it was done in Scots and the book is an early record of upper class spoken Scots usage - before then we really only have charters and legal documents, and the language used is shall we say, a little more restrained.

Now let us say we want to digitise the work. Yes, but what do we want to do?

If what we want to do is look at a representation of the book to reduce wear and tear on the original, then what we would do is take some high resolution pictures of the pages of the book and perhaps write a clever flash application to let you turn the pages. We might also accompany the images with a transcription of the text, as the original typeface is hard on modern eyes.

Oh look we've just made two objects out of one. If we add a modern translation we've got three distinct objects - the digital representation of the object itself, the text, and the translated text.

Now how should we store them? The pictures are simple - we store them in a lossless well known image format. But the other two - should we store them in a non revisable format such as pdf, or a revisable format such as epub or odt (or indeed provide an option to provide the text in a range of formats)? And of course if we're treating these as scholarly inputs, what should we then do with the mp3 recording of the reading in two voices? Done as part of a language or theatre studies project it's arguably a scholarly output, and part of the digital patrimony of the institution. Oh, and we edited and abridged the text to fit it into thirty minutes. Do we archive that as well, and do we archive it as a set of edits or just the final edit?

Add mixed media art works and it becomes even more complex...

Wednesday, 23 December 2009

Happy Holidays !

saw this outside of the Federal Family court building - a little pre xmas revelry I fear ...

Monday, 21 December 2009

icloud and the cloud

earlier this year I wrote about icloud. Since then things have moved on and iclouds have now launched a hosted service that's usable from a range of devices including the iPhone.

This is an interesting move as one of the problems with having multiple smart devices, computers, phones etc is that your stuff is, well, everywhere. Services like Dropbox help keep individual stores in sync (as well as providing a silent source of information leakage).

The next stage is cloud based storage so that your stuff is accessible from everywhere. Doing it through a browser provides a universal access mechanism, and coupling this with virtual compute allows you to execute applications without worrying too much about local host architectures or capabilities - something that ajax heavy applications like google docs or wikidot do care about.

It's interesting - cheap host agnostic computing - all you need is a (recent) browser ....

Sunday, 20 December 2009

been bush

been off for a few days bushwalking in Victoria, during which time we lived in a rather nice tin hut on the edge of the bush, and very much off net - no internet connection, no 3G, although strangely enough GPRS worked, allowing us push email. In effect it meant the phone worked a little like a portable email reader, but without any web capability, and our trusty travel computer was simply a kilogram of nothing. While I'd thought of buying a prepaid USB 3G modem, I simply hadn't got round to it, not that it would have helped.

Other than that we had to be so last century - watch the tv news to get the weather forecast, read the Age (and as always wonder why it is so much better a paper than the Canberra Times, despite them both being owned by the same company).

And for a few days it was fun to be disconnected from the illusion of being branché, of being in the flow ...

Saturday, 12 December 2009


this morning we went for a walk earlyish, say 0730, through the nature reserve above our house, where we saw a fox.

Didn't know you got urban foxes in Canberra ...

Wednesday, 9 December 2009

2009 - what worked

It's traditional to do an end of year technology review. This is my list of the things that made my life easier in 2009


excellent way to share files between home work and other computers. Just bloody works.

Asus PC 701 SD

proved its worth while travelling overseas - reliable, light, effective. Don't leave home without one

Interead Cool-er ebook reader

despite my initial reluctance I've warmed to this. It's allowed me to store and have a collection of medieval research (ok, dilettante research) texts on hand while saving untold trees in the process, not to mention allowing me to visit the unknown corners of project gutenberg


Books from the UK at UK prices and free delivery. Incredibly good value - at least something good has come out of the gfc

Crunchbang Linux

fast light effective - the way linux used to be


By turns inane, infuriating, grabbing a handful of sand frustrating, somehow it has turned into something useful if only I could put my finger on it ...

Skype wi-fi phone

a belated mention for last year's undoubted winner. Once you've sampled being able to use skype from anywhere the wi-fi signal goes you won't look back

Tuesday, 8 December 2009


Outside of our house we have some definitely non native silver birches, which are currently weighted down with seed pods. The rosellas just go crazy for them ...

Repositories, clouds, and corruption

Digital repositories are interesting examples of file systems. Typically they consist of a presentation layer, a database, and a storage layer.

The presentation layer is the application, or applications which people use to add data to the repository, search for data within the repository, and retrieve data.

The database stores information about the individual objects (files) in the repository and the relationship between them as well as information describing the file and the contents.

The files themselves are stored in a file system, usually with unique system generated names redolent of babylonian prophets. As you only ever search for the object you don't need to know the name of the object, all that is required is that the system does. This is why, for example, files downloaded from flickr have weird long hexadecimal names. It also means that the filestore is unstructured, and contains lots of files of similar size, the majority of which are only accessed rarely.
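As a rough illustration of the idea, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the `ingest` and `retrieve` functions are hypothetical, the "database" is just a dictionary, and the opaque name is taken to be the SHA-256 of the content (a random identifier would do just as well). It is not how any particular repository product does it.

```python
import hashlib
from pathlib import Path


def ingest(store: Path, catalogue: dict, title: str, data: bytes) -> str:
    """Store data under an opaque system-generated name and record it.

    Assumption for this sketch: the name is the SHA-256 hex digest of
    the content. The catalogue dict stands in for the database.
    """
    object_name = hashlib.sha256(data).hexdigest()
    (store / object_name).write_bytes(data)
    catalogue[title] = object_name
    return object_name


def retrieve(store: Path, catalogue: dict, title: str) -> bytes:
    """Look the object up by what the user knows, not by its file name."""
    return (store / catalogue[title]).read_bytes()
```

The user only ever deals with the title; the long hexadecimal name is something only the system needs to know, and the store itself stays a flat pile of files.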

The filesystem is part of the storage layer.

Repositories are interesting as one typically only adds files to them, never deletes content from them, but one needs to guard against corruption and data loss. Typically this is done by making multiple copies of the file, checksumming the files, storing the checksum in the database and periodically rerunning the checksum operation and comparing the answers with the answer stored at time of ingest. If one of the copies is corrupt, it's replaced by copying a good file.
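The checking cycle described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a description of any real product: the function names `checksum` and `audit` are hypothetical, SHA-256 is an arbitrary choice of checksum, and the expected value is simply passed in where a real system would read it from the database.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(copies: List[Path], expected: str) -> List[Path]:
    """Compare every copy with the checksum recorded at ingest.

    Copies that are missing or no longer match are overwritten from a
    copy that still does. Returns the list of copies that were repaired.
    """
    good = [c for c in copies if c.exists() and checksum(c) == expected]
    if not good:
        raise RuntimeError("no intact copy left to repair from")
    repaired = []
    for copy in copies:
        if copy not in good:
            shutil.copyfile(good[0], copy)
            repaired.append(copy)
    return repaired
```

Run periodically over every object in the store, something like this is the do-it-yourself version of what the storage layer would otherwise be asked to do.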

Typically, in the old days, one would use a product like SAM-FS/QFS to do this. It also used to be expensive to license so most repositories didn't and instead trusted in tape backups and rsync.

Of course backing up repository stores to tape is an interesting exercise in itself given that it consists of lots of small files in a flat structure - after all the database doesn't need a directory structure. This can be extremely inefficient and slow to backup. Much better in these days of cheap disks to copy several times.

And of course suddenly what one starts looking at looks like a distributed clustered filestore, like the googlefs or Amazon S3. And there have been experiments in running repositories on cloud infrastructure.

But of course, that costs money.

Building your own shared distributed clustered filestore may be a viable solution. And given that not just repositories but LMS applications are moving to a repository style architecture there may be a use case for building a local shared pool, using an application such as glusterfs - a distributed self healing system that is tolerant of node crashes.

Doing this neatly decouples the storage layer from the presentation layer - as long as the presentation layer can write to a file system, the file system has the smarts to curate the data, meaning that it then becomes easy for separate applications running on separate nodes to both write and share data - after all, all it is is a database entry pointing to an object already stored elsewhere on the system.

Definitely one to take further - other than the slight problem that while people have tried running dspace and fedora against systems such as the sun honeycomb, no one seems to have considered glusterfs...

Wikis - a use case at last ...

I must admit I've never really got wikis until recently. Anything you could do with a wiki you could do with html and web pages, after all markup is markup and neither really imposes structure.

I've always thought in terms of connections and lists and related facts - not very visual I'm afraid - and I've recently started using wikis as dot pointed lists (see here for an example on the tudor ascendancy and here for one on early medieval travel) as a way of organising facts and links, any one of which could be expanded out to some text as seen in this slightly more complex example.

Using a wiki this way allows you to build a more complex living document piece by piece.

Next question - ignoring its inherent funkiness, what does google wave bring to the piece that a shared wiki doesn't?

Monday, 7 December 2009

Treasures of the Musee d'Orsay at the NGA

To the NGA for a members (yes we're fully signed up supporters) view of their new exhibition of paintings from the Musee d'Orsay.

As always there were drinks - a decent dry white and an Australian champagne - and canapés while standing around chatting in the Sculpture Garden. As always it was an interesting cross section of society - those who were there to see and be seen, those who were there for the art and those who were there because their friends were. Dress styles ranged from the artistic with frightening lipstick or beards - according to gender - to the amazingly normal.

Despite the inherent pretentiousness the exhibition was rather good - I'd actually say better - for being smaller and more tightly picked - than the collection as exhibited in Paris last July.

A few nice Van Goghs showing how his technique in Arles was alternately frenetic or controlled, a few Pissarros, Gauguin from both his Breton and South Pacific periods, some nice little Seurat pointillist sketches, a Monet pretending to be a Turner and some paintings by Vuillard and Emile Bernard that I hadn't seen before. The Bernard had the same economy of line that I like in some 1920s and 30s posters - just line and colour.

Definitely worth a visit.

Friday, 4 December 2009


there's been a lot of chatter recently about NewsCorp not wanting its content plundered by google etc etc and Murdoch's threat to erect paywalls to restrict content to subscribers.

Well, in our typical anglophone arrogance we're ignoring what is happening elsewhere. French language newspapers are also under threat and reacting in equally incoherent ways.

This morning, I was looking for an update on the Paris museums strike. Because I'm a left leaning liberal I went immediately to Liberation, typed in my search terms, found the article, found it was restricted to subscribers, and was invited to sign up for a EUR12 per month package including internet access and delivery of the printed paper at the weekend - mes amis - un peu de realite, j'habite en Australie!

So I went to Le Monde, found the article I wanted and retweeted it.

The point being Le Monde makes its content free to casual users and Liberation does not, and that in aggregate means that Le Monde gets more contacts and can sell more online ads, and even might sell the odd extra copy now and then. And they probably don't lose that much by it.

Liberation doesn't get any of that. In fact I guess it drives people who might also have bought the printed paper on a one off basis away.

And of course there's a halfway house such as shown by the New York Times which tries to promote a reader community so that the idea that the paper is worth buying spreads virally.