Monday, 3 April 2023

Is nine million enough to fund Trove?


Trove has recently announced new funding from the Australian government.

The funding works out at around $9 million Australian a year, which sounds good, but is possibly a bit on the mean side.

So why do I say that?

Trove is essentially a digital repository, which means it consists of a database containing all the metadata that is searchable and a data store that contains all the digital objects.

While it makes sense to use good quality hardware to run Trove, the hardware required is standard off the shelf stuff and not particularly exotic – commodity servers will do nicely.

The database needs to be backed up and have measures in place to ensure resilience. The object store needs less, as it is usually only added to: all that’s required is a periodic incremental backup in case some space junk lands in the car park, and measures to guard against disk failure.

Again, all very standard. While you need some resilience, we’re not talking about the measures the big banks deploy to ensure 24h availability of their online banking solutions.

So, the hardware part is not expensive, nor are the maintenance requirements. Electricity costs for running the servers and keeping them cool may be a constraint, but data centres usually have negotiated contracts for the supply of power, so the ongoing costs are predictable.

Likewise, the costs of periodic hardware expansion and replacement should be reasonably predictable and relatively easy to budget for.

Then there are the costs of digitisation itself. Again, commodity hardware is now good enough for most purposes. Film and microfilm scanners basically consist of a standard digital camera in a housing that allows you to advance and photograph each frame.

In some cases it has to be a manual process because of the poor quality of the original film; in others it can be almost automated.

Scanning fragile materials such as old newspapers, bound nineteenth-century periodicals, etc. is more fraught and needs both specialist skills and equipment, but I suspect that the bulk of digitisation work is the scanning of previously microfilmed material.

So, nine million should be able to cover the costs of both maintaining Trove and financing the ongoing digitisation programme as far as hardware goes.

But there’s the human factor.

Recruiting and retaining a team of computing technicians and engineers in Canberra is not cheap. I know this; I’ve been there.

There's continual demand for good people and given they are all public servants or contractors, it's quite easy for people to change jobs, and of course some departments can pay more than others.

The last time I had to deal with such things was seven years ago, and the figures quoted below are based on 2016 costs. Given that over the last few years wages growth has been fairly low, my costs are probably not too far out.

Once you’ve added in the costs of superannuation, long service leave, payroll tax, plus some contingency funding for sick leave, parental leave, maternity cover and the rest, a decent, competent computer technician who spends her days swapping dead disks, checking hardware status, dealing with fan failures and the like will cost around $80,000. A software engineer will cost between $100,000 and $120,000. A service manager, at least $150,000. Basically, your humans will cost you around a million a year to keep things working.

I have no idea what a competent digitisation technician costs, but I would be surprised if it was much less than a computer technician, and of course you have some more senior digitisation staff to do quality control.

I don’t know how the digitisation team is structured, but I would guess it would cost around $750k to $1 million a year. That puts your overall human resources costs at around $2 million per annum, which leaves an actual operating budget of around $7 million.
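For anyone who wants to play with the numbers, the back-of-envelope sum above can be written down as a few lines of Python. This is only a rough sketch using my own guesses from this post; the variable and function names are made up for illustration, and none of the figures come from Trove's actual accounts.

```python
# Back-of-envelope version of the estimate above.
# All figures are Australian dollars per year and are guesses from this
# post, not published Trove numbers.
FUNDING = 9_000_000                        # announced funding, roughly
IT_STAFF = 1_000_000                       # "around a million" for the technical team
DIGITISATION_STAFF = (750_000, 1_000_000)  # guessed low and high cost


def remaining_budget(funding, it_staff, digitisation_staff):
    """Return what is left once the people have been paid."""
    return funding - it_staff - digitisation_staff


# Worst case uses the high digitisation guess, best case the low one.
low = remaining_budget(FUNDING, IT_STAFF, DIGITISATION_STAFF[1])
high = remaining_budget(FUNDING, IT_STAFF, DIGITISATION_STAFF[0])

print(f"Left for everything else: ${low:,} to ${high:,}")
```

Changing any of the three guesses at the top shows how quickly the operating budget shrinks if staff costs turn out higher than I have assumed.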

That is of course in Australian dollars, and you need to maintain some wiggle room given that most serious infrastructure has a price that is tied to the US dollar price, and so even though you are paying in Australian dollars, you have to allow for drops in the value of our dollar against the greenback.

I don’t know the details of Trove’s hardware costs or data centre costs so I can’t guesstimate their annual costs. I’m guessing that the hardware consists of a few racks of servers and disk in some anonymous government data centre in Fyshwick or Hume.

So, is seven million enough?


The hardware and infrastructure running costs are essentially fixed costs - while it's possible to lengthen replacement cycles you can only sensibly go so far, which means that your only way of reducing costs is paying people less (or paying fewer people). In Canberra you cannot really pay less than the going rate, meaning that if nine million is not enough, the headcount needs to be reduced, with consequent impacts on service and new initiatives.

I don’t know enough to say for sure. Past experience makes me feel the budget is a little tight, but not impossibly so.

However, the really good thing is that it has a recurrent and defined budget allocation. That can only be a good thing…


