Wednesday, 30 June 2010

Wisdom teeth

Yesterday, at the age of 54 years and 3 months I had my wisdom teeth out, all four of them.

I probably should have had them out thirty years ago, but I’d come off my bike and cracked my jaw a couple of years previously, so they were leery of taking them out in case they did extra damage. Instead they drilled them out and filled them up with resin as a temporary fix – one that lasted thirty years instead of five.

I had it done as day surgery, and apart from J missing the freeway exit and then driving to the wrong hospital – fortunately the dental hospital was round the corner on the next block but one – the whole procedure was professional and painless – even though I now know what a server feels like having an upgrade.

Laurance Tyler, a friend and former colleague, always used to say he used to dream he was a compiler. This time I felt like a server – get on the table – do you know what is being done? – that your signature on the consent form? – lie back and connect you up – just going to put a line in you – breathe this – at the end I didn’t know if I was going to come out minus my wisdom teeth or plus a new version of Solaris.

Still it’s done, and strangely I don’t feel that bad …

Tuesday, 29 June 2010


Originally uploaded by moncur_d.

Freezing cold in Canberra - down to -5 on Sunday night and if anything colder on Monday night with frost still on the ground at 10.30 this morning ...

clouds and repositories

Cloud storage would seem to be the ideal fit for repositories, given that (a) no one ever knows how much storage they need, (b) response time needs merely to be adequate and (c) moving the content to the cloud allows one to outsource all the curatorial questions about storage, backup and so on to someone else in an architecture that looks something like this:

[Diagram: simple cloud repository architecture]

where the metadata server remains in house and the object store sits in the cloud.

Certainly this is the idea behind Fedorazon and is something that would be quite easy to emulate by using the repository/collection management/content management software of your choice and something akin to Gladinet to connect the cloud storage.

And for something like Occams, which is designed for rapid collection assembly and development, this sort of architecture makes a lot of the provisioning problems go away.
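The split described above, with descriptive metadata held in house and the objects themselves pushed out to a cloud store, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Repository` and `InMemoryObjectStore` names are invented for the example, the "cloud" is an in-memory stand-in rather than any real provider's API, and nothing here reflects how Fedorazon or Occams are actually implemented.

```python
import hashlib
from dataclasses import dataclass, field


class InMemoryObjectStore:
    """Hypothetical stand-in for a cloud object store.

    A real deployment would replace put/get with calls to a provider's API.
    """

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]


@dataclass
class Repository:
    """Metadata stays local (a plain dict here); content goes to the object store."""

    object_store: InMemoryObjectStore
    metadata: dict = field(default_factory=dict)

    def deposit(self, title, data, **descriptors):
        # Use a content hash as the object key so identical content maps to one key.
        key = hashlib.sha256(data).hexdigest()
        self.object_store.put(key, data)
        self.metadata[key] = {"title": title, "size": len(data), **descriptors}
        return key

    def search(self, term):
        # Searching touches only the in-house metadata, never the cloud store.
        return [k for k, m in self.metadata.items()
                if term.lower() in m["title"].lower()]

    def retrieve(self, key):
        return self.metadata[key], self.object_store.get(key)
```

The point of the design is that searching and cataloguing never leave the building, while the bulky, slow-changing content lives wherever storage is cheapest; swapping the stand-in store for a real one should not change the metadata side at all.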

And not to be unsubtle, the large cloud providers can provide terabytes of storage for less than most institutional IT services can, purely through economies of scale. I haven’t done the maths but I’ve seen figures from one cloud provider that suggest that they are substantially cheaper than we can provide storage using conventional SAN technology. On the other hand, a repository doesn’t need SAN, and something like Isilon, or indeed a pile of Dell EqualLogics with StorNext, or Apple Xsan, might be adequate, with perhaps replication into a private LOCKSS infrastructure for preservation/backup.

However, there’s another variation, which I hadn’t thought of.

If you look at my recent post of the Kiandra mailman you’ll see that the image is sourced from Flickr.

However, that’s not the only way of accessing the image: you can access it with richer metadata from the Powerhouse Museum’s server – by placing it in the flickr commons, they’ve solved curation – flickr are good at looking after digital images and can provide a range of resolutions, and the image is searchable and findable with the rudimentary metadata (tags) flickr allows, yet the conservators and cataloguers are free to add as much rich metadata as they wish.

Of course this only really works for images and more particularly images in public collections, but it’s an interesting example of a public private partnership in the repository space. And given that we know academics routinely put collections of material in flickr for teaching/research purposes, one wonders whether giving something like Occams a flickr connector would allow the capture and harvesting of that material.

But then as the name of the game is federation and we are increasingly seeing more complex architectures such as

[Diagram: generic architecture]

one can begin to envisage a mixture of cloud and local storage and a range of hybrid solutions …

Monday, 28 June 2010

a Kiandra puzzle

Well, as I said, yesterday we went to Kiandra - and sort of as a consequence I spent a little bit of lunchtime looking into the history of Kiandra, and I came across this quite striking picture from the Powerhouse museum collection of a mailman delivering mail in the snow sometime around 1900 (+/-15):

Now what's also interesting is that I also stumbled across this image (dating to roughly the same time) of a cigarette card in the NLA digital collection:

Not exactly the same image but close enough to make one wonder if the artist who drew the card had seen the print ...

(and of course with my metadata hat on, whether it would be possible to easily query the information to see if it was possible ... )

Sunday, 27 June 2010

sometimes the littlest things just stuff up …

well, this was going to be a happy weekend post about our really good though squelchy bushwalk from Pollock’s Crossing at Kiandra through to Four mile hut, so named as it is four miles from Kiandra – these 1860’s gold diggers were just so imaginative.

And we did have a really good walk. And I took some really good pictures of ice-rimmed ponds and half frozen snow on the sedges, except that I didn't notice that my thick fingered winter gloves had knocked my camera onto manual and I ended up with some incredibly washed out images.


Anyway, we did have a good, if wet and splodgy time, and as for images all I can suggest is this one from the same location in 2006 …

Tuesday, 22 June 2010

Happy 5518 !

Yesterday was mid winter’s day in the southern hemisphere and coincidentally the Inca New Year. J and I long ago had this idea we should make a point of doing something special to celebrate it given Australia’s lack of a significant mid winter festival.

So far we’re not doing well at this. We managed something for 5516 – a small intimate dinner with some friends – but last year we celebrated by going on vacation which isn’t really what we meant by a small celebration.

Well this year I was too damned jet lagged after my dash to Providence to do any more than drink a glass of wine and fall asleep on the sofa in the middle of the 7.30 report.

So, a day late, Happy 5518! Enjoy!

Monday, 21 June 2010

To Providence and back …

Last week I had the slightly odd experience of going to the States for two and a half days, to the project bamboo workshop event in Providence, Rhode Island.

If you live in Australia and you wish to go to an overseas conference you have to resign yourself to spending a lot of time cramped in an aluminium tube eating indifferent food and watching somebody else’s choice of movie. Actually I flew United, and the food wasn’t that bad, even if being offered a pastrami sandwich at four in the morning showed perhaps a little cultural disconnect. The other nice thing about United is that they appear to have this unofficial policy of upgrading customers on longhaul flights to premium economy where possible for the domestic legs of the journey, giving you that crucial little bit of extra space to curl up in. So from Sydney to Los Angeles it was cattle class, but from Los Angeles to Providence it was civilised.

Flying into Los Angeles I was astounded when the homeland security clerk in all seriousness said ‘welcome back to America’, stamped my passport and waved me through, so astounded that I didn’t notice he’d forgotten to put the bottom half of my green waiver form in my passport.

But I was in, and in the land where food means meat and chemicals, where turkey salad sandwiches have four slices of turkey and a single lettuce leaf, and where vegetables means fried potatoes – America basically does not do vegetables – as if it were a nation of thirteen year old boys refusing to eat their greens. I’m always reminded of George Bush saying he did not like broccoli and he was damned if he was going to eat it now he was president. And that seems to sum up America – meat and fried potatoes in ridiculous servings, but not much in the way of fruit and vegetables.

Strangely, the incessant ads for medications to treat bowel disorders don’t come as a surprise.

However, onto the workshop. There are obviously things I can’t disclose about the workshop but essentially I went to discuss our continued involvement in Bamboo and to try and get some congruence between what was happening in Bamboo and the work we’re already doing on metadata and making data more accessible. And broadly, I think that part went well. I also met a whole lot of funny, interesting folks who were doing some good and interesting work, particularly as regards Hubzero, an online collaboration and computation tool that allows people to draw together research collateral and easily run simple analyses on the data.

One could see immediate advantages for the social sciences – imagine a range of datasets with common metadata descriptors and a range of standard tools making running and sharing analyses simple – no more having to hand craft analyses or work out how best to combine datasets.

And that’s probably as much as I can say about the work related aspects of the meeting.

I decided at the last minute to take the Ookygoo with me in preference to my office MacBook Pro, and I didn’t regret the choice as it meant I could basically get away with an overnight bag and a small backpack. As always my little Asus performed perfectly, allowing me to do everything one needs to do at a workshop – tweet, email, write draft papers on Google docs (and no worry of losing any crucial drafts), easily distribute them as doc and odt, and share them, as well as skype home.

Interestingly of the four or so people at the workshop who did not have a Mac, three of them had an Asus Eee and one had a Dell netbook all with the 10 inch screen. As I’ve said elsewhere, my only regret about the Ookygoo was buying the 7” model rather than the slightly more expensive 10” model. I couldn’t tell you how many of the netbook users were linux and how many xp, but I’d guess 50:50.

Tweeting was interesting – around the presentations and discussion there was a continual update on who was saying what and people raising points for discussion via twitter – so rather than shouting out or asking questions at the end, twitter provided a live communication channel. To get a flavour of this take a look at the tweet archive at

And then, after two days of intense discussion it was over and time to come back. I couldn’t get a flight back on the Friday evening that worked so I had a little time in Providence.

Downtown Providence is distinctly tatty. Somehow I thought it would be full of nice eighteenth and nineteenth century buildings, nice bars and restaurants, with a cultured nightlife.

It wasn’t. Perhaps with a car one might have found such places, but round about Kennedy Plaza in the Arts precinct, it wasn’t happening. A pity really, as Providence could have been nice in so many ways.

What else to say? Another 30h spent in airports and aluminium tubes, with just long enough at O’Hare to decide I dislike Chicago airport intensely: too crowded, too poor a range of eating places, and why oh why do you need to show ID to buy a drink?

I might hope to be mistaken as being in my late forties rather than mid fifties, but that’s as far as vanity goes. No one is ever going to ask me if I’m over 21.

The only other thing of note was United’s insouciant response to the missing exit card - ‘take a new blank card, fill out the exit section, write duplicate on it and give it in on exit’.

One just hopes that Homeland Security’s metadata schema can cope with that and match things up properly or else next time I can see myself in a grey room with a steel desk explaining all of this ….

{update 23/06/2010}

See Pete Sefton's excellent blog post for context on the ANDS metadata and party data story

{update 24/06/2010}

See Jon Laudun's posts (day 1, day 2) for a more professional account of the event ....

Sunday, 13 June 2010

History, archaeology and metadata

At the recent ANDS bootcamp the discussion veered towards the use and reuse of data and metadata in the humanities, and I was hard put to come up with an example.

I said something about using the findspots of Byzantine coins and sixth century amphorae to show that trade from the Mediterranean was primarily with the west of England and revolved around the tin trade, with the Celtic successor states swapping tin for Tunisian red among other things.

Other examples could have been Alessia Rovelli’s work on early medieval coin use (and reuse) or Nathalie Villa’s (with others) work to reconstruct a network of social obligation over space and time in the medieval Languedoc using techniques akin to cluster analysis.

There are other examples – for example I have heard of people using maps of Roman settlement sites in East Yorkshire combined with topographical and ecological data to show that the Romans did not settle near wet or marshy areas – which is interesting as one of the major activities in Roman East Yorkshire was growing grain for shipment (by sea) to feed the army on the Rhine frontier. Wheat, of course, prefers a dry well drained soil.

All interesting, fascinating even, but none of them really sexy.

I then began to think about my favourite example of impermanence – Roman army pay dockets.

Broadly, between the accession of Augustus and the death of Septimius Severus, a period of roughly 250 years, the Roman Army (excluding the Auxilia) consisted of 30 legions of 5000 men. Three times a year (later four times a year) a soldier was paid and issued with a statement of account showing how much he had been paid, how much had been docked for broken equipment, how much had gone to his (compulsory) savings account, how much to the burial club, and so on.

So the Roman army must have produced 250*30*3*5000 pay dockets (just under half a million a year, or well over a hundred million during the time they did this). Assuming that they also did something similar for the Auxilia, we would expect to have a substantial number of these.
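For anyone who wants to check the back-of-envelope sum, here it is as a few lines of Python. The figures are the ones quoted above (30 legions, 5000 men, three paydays a year, roughly 250 years); the function names are just labels for this sketch, and the estimate ignores the Auxilia and the later move to four paydays unless you pass that in.

```python
# Rough estimate of legionary pay dockets, using the round figures quoted in the post.
LEGIONS = 30
MEN_PER_LEGION = 5000
PAYDAYS_PER_YEAR = 3   # later four; three gives the conservative figure
YEARS = 250            # roughly Augustus to Septimius Severus


def dockets_per_year(legions=LEGIONS, men=MEN_PER_LEGION, paydays=PAYDAYS_PER_YEAR):
    """One docket per soldier per payday."""
    return legions * men * paydays


def total_dockets(years=YEARS, paydays=PAYDAYS_PER_YEAR):
    return years * dockets_per_year(paydays=paydays)


print(f"{dockets_per_year():,} per year")   # 450,000 per year
print(f"{total_dockets():,} in total")      # 112,500,000 in total
```

Against a total of that order, fewer than ten survivors is a striking loss rate, even allowing for the figures being round numbers.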

We don’t. We have fewer than 10.

Which is a pity, as these were semi structured documents with a predictable format, and having this sort of information would let us know the actual strength of the Army and plot the impact of substantial defeats on it. It would also let us understand what happened to the Legio IX Hispana, which was once thought to have marched out of York sometime around 100 AD to be massacred by the Picts – the legion never to be reformed – the story that forms the background to Rosemary Sutcliff’s children’s story ‘The Eagle of the Ninth’ and one of the things that got me started with the Romans.

The truth appears different. It now looks, from the evidence of legionary stamps on bricks that the legion spent time on the Rhine frontier before disappearing on active service in Armenia in a battle during one of the hotter phases of the continual hot then cold war between Rome and Parthia. What crime the legio IX hispana committed has been lost.

However, the main takeaway here is that the availability of semi structured documents is what makes the construction of data sets possible.

Medieval land sale and marriage charters are another good example, as shown by Nathalie Villa. Because they were written to a formula, it’s relatively easy to extract the pertinent facts and build a dataset showing who exchanged what with whom.
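To make the idea of "written to a formula" concrete, here is a toy sketch of that kind of extraction in Python. Everything in it is an assumption made for illustration: the one-line formula, the sample lines and the placeholder names are invented, not taken from any real charter collection or from Nathalie Villa's work, and real charters would need far more forgiving parsing.

```python
import re
from collections import Counter

# Hypothetical formula, invented for this example:
#   "<grantor> sells|gives <property> to <grantee> in the year <year>"
CHARTER = re.compile(
    r"^(?P<grantor>.+?) (?P<action>sells|gives) (?P<property>.+?) "
    r"to (?P<grantee>.+?) in the year (?P<year>\d{3,4})$"
)


def parse_charter(line):
    """Return a dict of the pertinent facts, or None if the line doesn't fit the formula."""
    match = CHARTER.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["year"] = int(record["year"])
    return record


# Made-up sample lines with placeholder names.
SAMPLE = [
    "Person A sells a vineyard to Person B in the year 1150",
    "Person B gives a mill to Abbey C in the year 1162",
    "an illegible fragment",
]

records = [r for r in map(parse_charter, SAMPLE) if r is not None]

# Who exchanged with whom: count grantor -> grantee pairs as edges of a network.
edges = Counter((r["grantor"], r["grantee"]) for r in records)
```

Lines that don't fit the formula are dropped rather than guessed at, which is the honest default: the resulting edge list is only as complete as the surviving, legible documents.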

Likewise one might be able to correlate the feudal obligations of landlords with rents received and start to find how much it cost to put a knight or some well armed men in the field. Or indeed to plot the rise of the monasteries and other church institutions.

Of course we tend to think about medieval records because they’re immediately interesting and part of our own culture, but the data is in some places incomplete. However, if what we want to do is validate our methodology, there are some potentially complete datasets to work with – the records of the East India Company for one. Here was a commercial organisation that financed armies, fought wars and kept accounts, and the data was lovingly written in copperplate by accounts clerks, making it relatively easy to digitise and OCR, and hence automate data ingest.

And if the techniques work on the late eighteenth/early nineteenth century records it would be possible to extend back in time, and to tackle the records of the Dutch East India company to prove the techniques work and to build a set of proven methodologies.

And we already have an example of the value of digitising old records like this. The digitisation of British naval logbooks from the eighteenth and nineteenth century has allowed us to improve the climate records we have in areas where the records are distinctly spotty.

For example, until the late 1880s, with the advent of people such as Clement Lindley Wragge, meteorological observation in Australia was distinctly hit or miss, with little funded by the original colonies.

This means that we have perhaps 120 years of data at most. The logbooks from naval and convict ships would help push this back a further hundred years and help us establish whether the recent drought in New South Wales is evidence of climate change, or if the area is subject to periods of crippling drought as in the Federation drought of the 1900s.

Saturday, 12 June 2010

Archibald prize in Goulburn

Two years out of three we take a trip to Sydney to see that year’s Archibald prize finalists at AGNSW.

This was one of the years we didn’t, but instead went to Coonabarabran for a long weekend. However the NSW government has been touring the Archibald prize finalists round the state, and this month it was Goulburn’s turn, only an hour down the freeway, so off we went on a crisp cold southern Highlands winter day.

I’m sure the organisers were glad there was a small set of finalists this year, as they only just fitted into the exhibition space at Goulburn arts centre. Of the finalists I particularly liked

None of the rest said anything much to me, but as always your mileage may vary.

Afterwards, lunch in a cafe overlooking the square and a wander looking at the late nineteenth century and early twentieth century architecture, including some nice art deco houses. A return trip for some architectural photography is on the cards while we still have the winter light…

And it was enjoyable to have some cultural downtime after what had been a fairly intense week with the ANDS bootcamp (and the joys of being employed by the host institution which meant lunchtime networking time disappeared into solving other unrelated problems with other projects), reconnecting with old friends and colleagues from the digital preservation world, not to mention meetings with potential project partners.

To add to the insanity I’ve also been getting stuff together for a project bamboo meeting in the states next week, which made the experience of a leisurely afternoon in Goulburn all the more precious ….

{My rather disjointed bootcamp notes are also online}

Friday, 11 June 2010

Friday, 4 June 2010

professional readers

Scaling out again from my posts on e-readers and linearity I came across an interesting study in Digital Humanities Quarterly which almost, but doesn't quite, answer the question.

What it does show is that professional readers are gadabouts - they scroll back and forth, fork off to check references and bibliographies, hook into google scholar and wikipedia, wish to both annotate and take notes.

So any academic reader system would have to support:
  • annotation - storing documents in a revisable format that allows the embedding of location specific links
  • split view of footnotes/references - I recently came across an interesting variant of this in which the author published his book online as a pdf but the references were stored on a separate website ...
  • internet connectivity
  • decent editor and accompanying note organising system - putting notes into a self documenting structure - perhaps something like single user sharepoint
All of this would be quite easy to mock up - meaning it might be worthwhile (and simple) to do some usability testing to determine which features people really require and which they do not ...

Tuesday, 1 June 2010

blogging, tweeting and tumbling in quasi academia

I started this blog (actually I didn't - I had another blog which disappeared, demonstrating the impermanence of social media) really as a way to write down my thoughts and comments about those aspects of digital media that interested me, along with some random ruminations on everything from Byzantine Cornwall to public toilets, basically because I find the act of writing things down clarifies my thoughts (sometimes more effectively than others) and sharpens me up.

To this I've added a twitter feed as a self documenting set of interesting links - basically as a way of capturing things that I find interesting at the moment, be it e-books or early medieval history. I could equally well record the posts I found interesting and do a daily or weekly post. Of course just posting links is not always useful - sometimes one needs to add a comment or a note to the link, so as well as twitter I now use the tumblr microblogging service as a sort of commonplace book - post links with notes. And I've used flickr to post images, and scribd for text, not to mention wikidot for more involved note writing, and of course slideshare to post presentations.

And of course google docs for more involved writing, as well as material in progress that only needs to be shared to a select list, not to mention dropbox and its public share facility. I've also got a whole load of both work and personal interest collateral on my windows live skydrive.

So what's the point?

Well thinking about it, it looks very much to me like I've built myself a toolkit of resources to both do my work, and follow my interests, and, as I've said before a lot of what I do is not immensely different from the processes of scholarly life. It's also built around free resources available out there on the internet.

And of course, in exchange, I've released my content out into the wild, which might, or might not be sensible, but I doubt if any of it is going to make me a millionaire.

Yet there seems to be a belief in some quarters that special workbenches, or application portfolios, are required to enable collaboration and exchange among the academic disciplines. And this is something I have difficulty with.

Professionally, on campus we have a wordpress based blogging service and a Sakai based collaboration facility - basically as a place where people working on a shared project can upload and share material. Both are perfectly adequate, but differ not a whit from any blogging service or from using a hosted wiki service - or indeed google docs document sharing.

Which makes me ask, why does the belief persist in this need for special discipline specific software portfolios? Yes, there is a case for keeping work in progress in house for reasons of intellectual property, and for providing facilities to let people build and structure material online.

But specialist portfolios? Am I missing something?