Saturday, 15 February 2020

The costs of citizen science

Way back in 1984 I started on my first proper job after graduate school, working for an environmental science field centre.

I was paid, not a lot, but enough to get by on.

There were graduate students working there who had a bit of funding, some people on internships who got a little bit of money for survey work and so on.

However, most of the basic overheads were covered, so apart from rubber boots and army surplus parkas, going to work didn't really cost anyone anything, except perhaps a personal copy of a plant reference book or a better headtorch.

Fast forward to the present day.

I'm now retired and working happily as a volunteer, documenting artefacts for the National Trust.

This actually costs me something to do: buying rubber gloves, and bits and pieces to aid the documentation process, such as extra USB sticks and gizmos to read SD cards.

It's not a lot, and there's a degree of crossover with what I spend on my hobby of family history, so I'm happy to spend the money because I enjoy what I'm doing.

Treat it as a hobby and it's much the same as what J spends on art materials, a cost of doing something you find fun and enjoyable. If instead I enjoyed breeding orchids, recording old churches, or censusing bats, there would still be an overhead.

And then there are all the other things that go around it: my Office 365 subscription (really to pay for cloud storage of my digitised materials), my Evernote subscription, replacement printer cartridges, a WordPress account, and a few other memberships.

Now, I'm in the fortunate position of being able to afford to do this.

I'm also by no means unique in doing this; since I retired I've met amateur astronomers, former professional botanists, historians and so on, all doing good work for the fun of it.

Other people, of course, may not be so fortunate, and may find it difficult to pursue citizen research despite being well qualified to do so.

I don't have an answer to the funding conundrum, but with less and less state and federal funding for humanities research and for simple observational science, such as recording plant and animal species as they recolonize bushfire-damaged areas, it's inevitable that 'big' research is going to become more and more dependent on low-level volunteer citizen research efforts.

But I do have a suggestion. Most people who carry out citizen research are members of a local field studies group, or a history or archaeological society (by the way, I'm not).

If we had a citizen research body that people could register their projects with, via accredited local bodies such as history societies, that body could perhaps negotiate small discounts with suppliers.

That way no money changes hands, but easing these modest costs might help see a project through to completion, as well as getting the data out there.

No more daily tweet summary

For longer than I can remember I've used the services of paper.li to post a daily Twitter summary at 12.43 Australian east coast time to, where else, Twitter.

No more.


I've been running in freetard mode, using a basic account to provide a daily summary of the most popular tweets of both myself and the people I follow.

As I've said elsewhere, I started using twitter as a way of sharing interesting snippets with my team, colleagues and indeed anyone who wanted to follow me. Over the years my feed has changed from being focused on digital preservation and archiving with a bit of classical and medieval history thrown in for amusement, to a mixture of material about history, cultural repatriation and a smidgin of technical stuff - especially after I retired and no longer had a team of people.

The paper.li thing came about because some people asked me if I could post a summary every so often, and I was too lazy to write a script to do it for me, and what's more, paper.li added value in picking other popular posts from people I follow.

So I started pushing out an automated daily summary using their service.

Well, all good things come to an end.

For perfectly understandable reasons, the paper.li people have recently started being less generous as regards what you get with a basic (free) account, and I didn't want to pay for a pro account.

I suppose if I had my own consultancy and, more importantly, it made money, I could have claimed a pro subscription as marketing, but I don't, so I haven't, and being genuinely retired I have no external lines of funding to tap.

As an aside, this is a little-commented-on problem with volunteer working (or, if you're being grand, citizen research) - everything from my blue nitrile gloves for handling artefacts at the pharmacy to my Office 365 subscription comes out of my own money. I don't begrudge a cent of it - after all, it helps keep me sane - but there are limits to what I can reasonably afford.

So, the long and short of it was that when you looked at what you got from the new basic plan it wasn't worth continuing.

I did think about compiling a weekly summary, but rejected that idea - explorator is better established and does a better job, and I'm not sure if I can deliver the commitment required.

So that's it, I'll still keep tweeting about the stuff that interests me, but there will be no more 12.43 summary carefully timed for an Australian lunch break ...

Tuesday, 21 January 2020

Just for fun, I built another linux box this morning ...

It was wet this morning, and I was feeling fidgety, so I decided to replace the Xubuntu install on my MSI netbook.

Truth be told, I was never 100% happy with Xubuntu on the netbook - it always seemed a little slow. If it hadn't been for the fact that development was ceasing for Crunchbang about the time I upgraded it from Windows 7 Home to Xubuntu, I'd probably have gone for Crunchbang.

I've really not used it over the last three or four years, which is a bit of a pity, as the keyboard is pretty nice to type on, so I thought I might try installing something a little less resource hungry than Xubuntu.

Well, I didn't agonise overmuch about which distro to go for; I decided on BunsenLabs Linux, which was what Crunchbang recommended moving to after development ceased.

So, first of all, see if it works - I downloaded the appropriate ISO, wrote it to a USB stick with Rufus, and booted the netbook from it. As always with live systems booting from a USB stick it was a little slow to start, but once running it appeared to be pretty good.

Since there wasn't anything of value on the netbook I went for the 'nuke and reinstall' option, wiping the disk and doing a fresh install of BunsenLabs Linux.

Well, that didn't work - the installer claimed that it couldn't find the CD, something pretty crucial to the install process, and bombed out.

Turns out that Rufus by default writes the disk image in ISO mode, and BunsenLabs wants the image written as a dd image.

Now, when I did this back in the early twentyteens, I always wrote my bootable USB sticks using dd.

Lesson learned: Rufus does have an option for writing a dd-style image, and second time around the install worked perfectly.
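
For what it's worth, if you're writing the stick from an existing linux machine rather than Windows, plain old dd still does the job. A minimal sketch, assuming the downloaded image is called bunsenlabs.iso and the stick shows up as /dev/sdX - check the device name with lsblk first, because writing to the wrong device will destroy its contents:

    lsblk                                    # confirm which device is the USB stick
    sudo dd if=bunsenlabs.iso of=/dev/sdX bs=4M status=progress conv=fsync
    sync                                     # make sure everything is flushed before unplugging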

As always, booting the machine for the first time involved that moment of uncertainty when, after a screen of startup messages, it sits there with a blank screen before suddenly leaping into life. Thereafter startup is pretty fast, and after login you're into the fairly austere default Openbox desktop:




The install is fairly scant as regards applications, and strangely does not come with an email client (Thunderbird works fine, you just need to install it - the Synaptic package manager is included, or else you can be old school like me and use sudo apt-get install, then use the menu editor to add the new application to the desktop).
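
For the record, adding Thunderbird from the command line is just the usual Debian-style incantation, thunderbird being the package name in the standard repositories:

    sudo apt-get update                  # refresh the package lists first
    sudo apt-get install thunderbird     # installs the email client and its dependencies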

The browser is Firefox, and it has a little bit of trouble with the smaller netbook screen size, but taking it out of full screen mode solves that problem.

As I'm envisaging the machine as primarily a writing machine, the only other applications I added were focuswriter and retext for producing documentation.

And that was about it. I didn't even bother setting up printing, as I'm envisaging either emailing material to myself or using the web-based uploader to put things on OneDrive ...


Sunday, 19 January 2020

Yesterday, I built a linux machine ...

It might seem strange, but it's been five, possibly six, years since I installed any variant of linux on a real machine, the last being my old MSI netbook.

However, yesterday I installed linux onto my old Dell laptop - and I do mean old, embarrassingly so; it dates from 2010.

I reckoned that installing linux was the best way of wiping it prior to disposal.

Ubuntu now comes with a snazzy graphical installer rather than the old text-based one that I was familiar with, so I decided to simply follow the bouncing ball and do a standard install.

I wrote myself a live USB stick on a Windows machine with Rufus, plugged the old Dell into a power socket and booted it from the USB stick.

First time around it offered to automatically partition the disk to allow me to keep my Windows 10 install, by creating an 8GB partition for Windows and the same for Ubuntu.
I thought I'd try installing it as a dual-boot machine as a test, as my old second-hand Thinkpad is still on Windows 7.

While Windows 7 is now out of support I still need a Windows system on that machine (or, more accurately, a couple of Windows applications, OneDrive and OneNote) for the Dow's Pharmacy documentation project I'm working on, and I thought it might be useful to run a second, supported, operating system on the machine.

So I let the installer suggest the partition sizes - remember I was experimenting to see how good or otherwise the new installer was and was simply following the defaults.

I thought the suggested partition sizes were perhaps a little optimistic given that Windows 10 typically takes about 16GB, but I went with them out of curiosity, even though a full install of Ubuntu desktop is said to need somewhere between 12 and 15GB.

As I expected, the install crapped out at the end of the process having filled the Ubuntu partition.

Lesson learned - I'd say give both Ubuntu and Windows around 25GB each. On a machine with a 128GB SSD that might be a bit generous if you have a lot of data, but if you've got 256GB or more of local storage I don't see you having a problem doing this.

I didn't actually repeat the exercise or test whether Windows 10 still booted with its squeezed partition - truth be told I had other things to do that evening than play with partition managers - so I simply repeated the install process, but second time around went for the 'nuke everything' option.

This worked perfectly - in about forty minutes I had a working Ubuntu machine, and what's more one that ran reasonably well - which was gratifying given the age of the hardware.

To allow me to play with it a little, I revived my Ubuntu One account, added Livepatch support, installed a couple of extra applications - kate and focuswriter - added a printer (configuring my FX DocuPrint 115w as a Brother HL-1050), and set up mail and Firefox syncing.
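
In practice the extra bits boil down to a few commands - a rough sketch, assuming kate and focuswriter come from the standard Ubuntu repositories and that you have a Livepatch token from your Ubuntu One account:

    sudo apt install kate focuswriter             # the two extra applications
    sudo snap install canonical-livepatch         # the Livepatch client ships as a snap
    sudo canonical-livepatch enable <your-token>  # token comes from your Ubuntu One account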

My plan is to use it for a few days to play about with, including trying the Gramps genealogy package, whose lead development platform is linux, to see how things go.

Once I've finished playing, my plan is to reinstall linux - perhaps a pure Debian install this time - and set it up with a single default user before taking it for disposal as e-waste ...

Monday, 13 January 2020

Xiaomi air purifier pro

Well, I've done something that I never thought I'd have to do in Australia - I've bought an air purifier.



The reason for doing so is simple - we live in Beechworth, on the edge of the Alpine fire zone, which means that we've been suffering from poor air quality for over a month.

J is asthmatic and has been suffering from the effects of the smoke, I've been feeling a tightening in my chest on bad air days, and even the cat has been wheezing occasionally. As our house is built out of wood, with an 1880s core, it's intrinsically leaky and there's little one can do to stop smoke sneaking in.

So we bought an air purifier. 

I bought it from a reseller on eBay on the basis of product reviews alone - they're a popular model in China and replacement filters are widely available online; we just need to remember to order one once the filter starts to deteriorate, say drops below 30% capacity.

It incorporates a HEPA filter, and came with an Australian power cord. Getting it working was as simple as unboxing it and plugging it in.

There's a self-explanatory little panel on its front that gives you the current PM2.5 reading in µg/m³ and a coloured ring - green for good/acceptable, orange for not the best, and red for 'I wouldn't be here if I were you'.

So far so good.

The device is described as a smart wifi enabled device, which basically allows you to use your phone to turn it on and off, as well as view device status.

This requires you to download and install the Mi Home app, and then create an account. This proved to be more problematic than I expected. I started off using my Apple icloud.com email address, and for some reason the 'confirm your email' activation messages just disappeared. Changing to my outlook.com account was rather more successful.

You then need to enter your home network details, disconnect your phone from your home network, connect to the purifier's internal network, and then get your phone to transfer the network configuration details across. The device then fiddles about and eventually connects to your home network.

It would have been simpler to provide a minimal internal webserver to help configure the machine.

Once done, the machine shows up on the network like this:


The device only supports a 2.4GHz connection, but it does warn you of this.

The actual app data display is fairly self-explanatory.

In principle you could add multiple devices to the app, including multiple air purifiers.

For the moment, the device seems to work and seems to be fairly effective at scrubbing smoke out of the air ...

Saturday, 11 January 2020

Small museums and natural disasters

Back in December, I started work on a bushfire survival plan for the old pharmacy I'm documenting.

At the time I didn't grasp - in fact, I think very few people grasped - the scale of the tragedy that was about to engulf the south east of Australia.

In retrospect, my plan is not the best.

My original idea was that we would grab key items, put them (carefully) in the back of a car and drive the vehicle to a place of safety. The plan would work for a small isolated grass fire, but not for the massive fires that have been raging.

I've also read a number of plans and planning documents written by museum professionals since then. They're all deficient as applied to small museums run by volunteers.

Why?


  • in the face of a major bushfire emergency it is unfair and wrong to expect volunteers to devote time to saving museum contents when their own homes and loved ones may be at risk. This equally applies to salaried staff.
  • evacuating key items will not work unless you have place of safety arrangements in place in advance - the distances and amount of time involved make it impracticable to simply drive to a place of safety and wait out the emergency
So, what needs to change?

We still need to decide which items if any are to be saved. 

As soon as there is a 'watch and act' alert - I'm using the Victorian terms here - items need to be packed and taken somewhere. Given that 'watch and act' alerts have a reasonable amount of leeway built in, there should be time to do this, and also to allow staff and volunteers to return to look after their own homes.

The 'somewhere' is also important - we need to know in advance where the items are going - preferably some storage location in a local museum or art gallery in a town thought by the fire authorities to be defensible - and that needs a prior agreement.

There also has to be an understanding that situations may change rapidly, so people (a) need to know in advance what they might have to do and (b) there needs to be backup so if one volunteer has to bail out to evacuate their own home, someone else can provide cover.

All much more complicated, and possibly, in the case of a museum or historic house reliant on volunteers, impracticable.

Which of course highlights the need for excellent record keeping and cataloguing, and, where possible, digitisation. They are, after all, just things, and while the items themselves may be irreplaceable, knowing what has been lost means that we can reconstruct things, albeit digitally.

Friday, 3 January 2020

Digital preservation and preserving the digital

I've been doing digital preservation stuff in one way or another for over twenty years, managing projects, building servers, installing systems, even writing code.

In the early days it was all a bit finger in the ear - no agreed standards, no agreed preservation formats, and lots of home grown and ultimately unmaintainable solutions.

Nowadays - and I'm possibly out of the loop a little, being retired - it's basically a 'just add dollars' problem.

There is a tacit agreement about what to do, how to do it, and what software to deploy.

For some people a proprietary solution with predictable recurrent costs makes sense; others may find an open source solution, with its low entry costs but implied long-term funding for software maintenance (and, by implication, for maintaining a specialist team), more to their taste.

Either is good, either will work; it's basically a financial decision as to how best to proceed.

The days of having a test server sitting under your desk (and I've been there) are gone.

But there's an elephant in the room.

To date most digital preservation efforts have been focused on preserving digitised artefacts.

Books, early modern ballad sheets, insects collected in the nineteenth century, rude French postcards, etc, etc.

And this is partly because a lot of digital preservation efforts have been driven by librarians and archivists who wished to digitise items under their care for preservation purposes.

And this model works well, and can be easily extended to born digital records, be they photographs, medical records, or research data - it's all ones and zeros and we know how to store them.

And being linear media with a beginning and an end we can read the file as long as we understand the format.

What it doesn't work well for is dynamic hypertextual resources that do not have beginnings or ends, but are instead database-driven, query-centric artefacts:

From memory, the Wagiman dictionary was written mostly in perl and did a whole lot of queries on a database. I know of other projects, such as the Malay Concordance Project, that use similar technologies.

Essentially what they have is a web-based front end and a software mechanism for making queries into a database. The precise details of how individual projects work are not really relevant; what is relevant is that the webserver needs to be maintained - not only upgraded to handle modern browsers but also kept secure - the database software needs to be maintained, and of course the query mechanism needs to be kept working.
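
The pattern is simple enough to sketch in a few lines. This is emphatically not how the Wagiman dictionary or the Malay Concordance Project actually work - it's just an illustrative CGI script, with a made-up database file and table, that turns a search term from the browser into a query against an SQLite database:

    #!/bin/sh
    # illustrative only: a web request becomes a database query
    # (dictionary.db and the entries table are hypothetical)
    echo "Content-type: text/plain"
    echo ""
    # pull the 'term' parameter out of the query string and strip anything
    # that isn't a letter, as a crude guard against SQL injection
    term=$(echo "$QUERY_STRING" | sed -n 's/.*term=\([^&]*\).*/\1/p' | tr -cd '[:alpha:]')
    sqlite3 dictionary.db "SELECT headword, gloss FROM entries WHERE headword LIKE '${term}%';"

Everything in that little script - the webserver that runs it, the shell, sed, sqlite3 and the database file itself - has to keep working for the resource to stay usable, which is exactly the maintenance problem.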

Big commercial sites do this all the time of course, but academic projects suffer from the curse of the three-year funding cycle - once something is developed there's usually no funding for sustainability, which means that even if it becomes very useful no one is there to upgrade the environment, and it starts to suffer from bitrot. Left alone, sitting on a single machine with a fixed operating system version, it would probably run forever, but hardware dies, operating systems are upgraded and all sorts of incompatibilities set in.

While it's not quite the same thing, go and take that old tablet or computer you've had sitting on a shelf since 2010 and have never quite got round to throwing out. Try accessing some internet based services. Some will work, some won't.

And the reason, of course, is that things have changed in the intervening ten years. New versions, new protocols and so on.

So, what to do?

One solution is to build a virtual machine using all the old versions of software (assuming of course you can virtualise things successfully). The people who get old computer games running have a lot to teach us here - most games used every tweak possible to get as much performance as possible out of the hardware of the time. The fact that hardware is now immensely more powerful than even five years ago means that the performance cost of running emulations can be ignored, even if they're particularly complex.
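
As a sketch of what that can look like in practice - assuming the old system has been captured as a disk image called legacy.qcow2 - something like QEMU will happily boot it on modern hardware and forward its web port to the host:

    # boot the captured legacy system under QEMU/KVM with 1GB of RAM,
    # exposing its internal webserver (port 80) as port 8080 on the host
    qemu-system-x86_64 -enable-kvm -m 1024 \
        -drive file=legacy.qcow2,format=qcow2 \
        -nic user,hostfwd=tcp::8080-:80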

This gets rid of the hardware maintenance problem - as long as your virtualisation system allows, and continues to allow, you to emulate a machine that old, you're home dry - except you're not.

You need to think about secure access, and that probably means a second machine that is outward facing and passes the queries to your virtual machine. This isn't rocket science; there are a surprising number of commercial legacy systems out there - business process automation solutions, for example - that work like this.
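
On the outward-facing machine something as simple as a reverse proxy will do the passing. A minimal sketch using nginx, assuming the emulated machine answers on 10.0.0.5 port 8080 and leaving TLS and access controls to taste:

    # /etc/nginx/sites-available/legacy-app on the front machine -
    # a single reverse-proxied site (the address and server name are assumptions)
    server {
        listen 80;
        server_name legacy.example.org;
        location / {
            proxy_pass http://10.0.0.5:8080;
        }
    }

That way the outside world only ever talks to the proxy, never to the elderly virtual machine directly.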

The other thing is that the system needs to be documented. People forget, people retire to Bali, and so on, and sooner or later the software solution will need to be fixed because of an unforeseen problem ...