Friday, 10 October 2025

Guerilla cataloguing - part 0

 I've mentioned before that we planned to recatalogue the heritage book collection using LibraryThing, the heritage book collection being the contents of the Athenaeum when it functioned as the town library in Stanley.

As far as we can tell, they hardly ever deaccessioned anything, giving us a picture of changing reading tastes from sometime around 1862 to 1971 when it ceased to function as a library.

Actually, I suspect tastes haven't changed much, given the number of early copies we have of novels by Louisa M Alcott, Mary Elizabeth Braddon, Wilkie Collins and the rest - clearly the nineteenth century subscribers to the library had the same liking for mysteries and sensation novels as we do today.

Until we try it, we've no real idea how well recataloguing with LibraryThing and our proposed methodology is going to work.

To refine and document our procedures we are going to run a pilot project on a few shelves to see how well it works and if it works well, we'll turn it into a guerilla project where we basically just do it, and don't worry overmuch about deadlines or formal project plans.

There is an intention to try and get other people involved so we can turn the project round fairly quickly, so we do need a simple and robust set of procedures so we can bring people on board and get them up to speed - quite different from the documentation of Dow's and Lake View where there was only me and the main reason for documenting procedures was to avoid drift and capture any changes to the methodology.

So today it was part 0 of the exercise - creating an account on LibraryThing for the Athenaeum and, since part of what we want out of it is a set of MARC records to allow us to port the catalogued data to another library system, identifying some tools for verifying and manipulating MARC records. That matters especially because, instead of class marks or any standard cataloguing scheme, the original spreadsheet used shelf position.

This is worse than it sounds - for the third book from the left on the front row of shelf C the shelfmark is C3F, and for the third book from the left on the rear row the shelfmark is C3B. Unfortunately there's no guarantee that there are the same number of books in the front and back rows - as a scheme it's almost as eccentric as the Cotton Collection classification scheme.
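For what it's worth, the shelfmark scheme is at least easy to take apart mechanically. What follows is only a rough sketch in Python, assuming every shelfmark looks like the two examples above (shelf letter, position counted from the left, then F for front or B for back). The names parse_shelfmark and Shelfmark are made up for illustration, and allowing multi-letter shelves is my guess rather than something the spreadsheet confirms.

```python
import re
from typing import NamedTuple


class Shelfmark(NamedTuple):
    """One shelf-position mark split into its parts (hypothetical helper type)."""
    shelf: str      # shelf letter, e.g. "C"
    position: int   # position counted from the left, starting at 1
    row: str        # "front" or "back"


# Assumption: shelf letter(s), then a number, then F or B, as in C3F / C3B.
_PATTERN = re.compile(r"^([A-Z]+)([1-9]\d*)([FB])$")


def parse_shelfmark(mark: str) -> Shelfmark:
    """Split a mark such as 'C3F' into shelf, position and row."""
    match = _PATTERN.match(mark.strip().upper())
    if match is None:
        raise ValueError(f"not a recognisable shelfmark: {mark!r}")
    shelf, position, row = match.groups()
    return Shelfmark(shelf, int(position), "front" if row == "F" else "back")
```

Something like this would let us flag any spreadsheet row whose shelfmark doesn't fit the pattern before it goes anywhere near LibraryThing.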

So basically, we need to be able to validate the MARC output.

MARC is a binary format dating from the early days of library computing and, like BibTeX, is essentially a lowest common denominator format, ie one most other systems can read and process.

So, what we need is a utility that can read the binary MARC file and display the file in a human readable form - something that with MARC is a bit of an exaggeration.
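To give a flavour of what such a utility has to do, here is a minimal sketch in Python using only the standard library. It follows the usual MARC exchange layout (a 24 character leader, a directory of 12 character entries, then the fields themselves), but it is an illustration rather than a validator: the function names are invented, it assumes the records are UTF-8 encoded (older files may well use MARC-8), and it does no error checking at all.

```python
FIELD_END = b"\x1e"   # terminates the directory and every field
SUBFIELD = b"\x1f"    # introduces each subfield within a data field


def split_records(data: bytes):
    """Yield each raw record, using the length held in leader bytes 0-4."""
    pos = 0
    while pos + 24 <= len(data):
        length = int(data[pos:pos + 5].decode("ascii"))
        yield data[pos:pos + length]
        pos += length


def parse_record(raw: bytes):
    """Return (leader, fields) for one raw record.

    Control fields (00X) come back as (tag, text); all other fields as
    (tag, indicators, [(subfield code, text), ...]).
    Assumption: field content is UTF-8; undecodable bytes are replaced.
    """
    leader = raw[:24].decode("ascii")
    base = int(leader[12:17])            # where the field data starts
    directory = raw[24:base - 1]         # drop the directory's terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        body = raw[base + start:base + start + length].rstrip(FIELD_END)
        if tag.startswith("00"):
            fields.append((tag, body.decode("utf-8", "replace")))
        else:
            indicators = body[:2].decode("ascii", "replace")
            subfields = [
                (chunk[:1].decode("ascii", "replace"),
                 chunk[1:].decode("utf-8", "replace"))
                for chunk in body[2:].split(SUBFIELD) if chunk
            ]
            fields.append((tag, indicators, subfields))
    return leader, fields


def dump(raw: bytes) -> str:
    """Format one record as text, one field per line."""
    leader, fields = parse_record(raw)
    lines = ["LDR " + leader]
    for field in fields:
        if len(field) == 2:
            lines.append(f"{field[0]}    {field[1]}")
        else:
            tag, indicators, subfields = field
            text = " ".join(f"${code} {value}" for code, value in subfields)
            lines.append(f"{tag} {indicators} {text}")
    return "\n".join(lines)
```

Reading a file into memory with open(..., "rb"), passing the bytes to split_records and printing dump() for each record would give a rough field-by-field listing, enough to eyeball whether the export contains what we expect.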

Now, the last time I did any serious work with MARC was twenty years ago when I wrote a simple parser in Perl to take a set of MARC records and format them so that the records looked like old fashioned card catalogue images.

I forget why I was asked to do this, but I remember looking out at the rain coming down on the museum car park while I fiddled with regular expressions.

So we needed something to let us examine the contents of MARC files, and given that we have a budget of zero dollars and zero cents for this exercise it had to be both free and public domain.

Well, there's not a lot of choice - basically it seems to come down to Terry Reese's MarcEdit, which has the merit of being endorsed by the Library of Congress, FastMRCView, produced by the Russian State Library (formerly the Lenin Library) in Moscow, and the online only MRV MARC Record viewer.

Otherwise there don't seem to be a lot of options out there. It's quite possible that I've missed a couple of other public domain applications, but playing with MARC seems to be very much a minority sport.

I've decided, quite unilaterally, to go with both MarcEdit and FastMRCView in the pilot and compare the output. While both seem to do what it says on the tin, there's always a risk that one application interprets the data slightly differently from another.

FastMRCView is a Windows only application, while MarcEdit comes in Windows, Linux and OS X flavours. As most of the prior work on the catalogue has been done on Windows there's no pressing need to change operating systems.

So, we have our account and some software that looks as if it might help with the gnarly stuff. All that remains is to try it and see how well our proposed methodology works in practice ...

Wednesday, 8 October 2025

Changes to blogger.com

 Google appears to have decided that the way to improve the blogger experience is to add some AI generated stuff to it - see the following couple of screenshots


I'm doubtful about this, especially the latter option given that Google's AI enabled search often gets things wrong if it's an obscure topic (like the ones I post on).

As to what a 'Google experience' constitutes, I'm not sure.

For this reason I'm going to ignore these new options for the moment ...



 


Friday, 3 October 2025

Plant remains in heritage books

 Up at the Athenaeum today we had a little conundrum.

We had been donated a book dating from the early 1870s, which had been given as a Sunday School prize to a member of a local family still resident in the area.

The book's exact provenance is unknown - there are some markings in pencil that suggest that at some time it had been resold in a second hand book store, but there's no doubt about its origins - the original dedication is intact which gives the name of the recipient, the date, and the location - in this case Three Mile, a now vanished mining settlement on the outside of Beechworth.

Unfortunately, the book is a wreck. The spine is broken, there are loose pages, possible insect damage, and foxing. It is probably not worth conserving, but might be worth retaining as is because of its local connection, especially as the family still live in the area.

But when I was leafing through it to check for damage I found this


someone had at some point put a small plant inside, possibly as a keepsake or a bookmark.

Now, if we decided to preserve the book because of its connection to a local family, rather than simply photographing the dedication on the fly leaf, what do we do about the plant remains?

Well, I didn't know. Google was singularly useless, so I appealed to Mastodon.

No one replied, but I had a brainwave.

When I was a much younger man, I had a girlfriend who was a field botanist.

When we went for a bushwalk, if she found an interesting plant she hadn't seen before she would take a sample, wrap it in a bit of newsprint and put it inside a field guide for later identification.

Putting it inside a field guide kept the sample flat and the newsprint absorbed the moisture (more or less).

Proper herbaria - reference collections of dried plants - are a little more elaborate, but not by much, with the plants being pressed flat on absorbent acid free paper and then transferred to a fresh sheet of archival paper and attached with archive quality paper tape and then stored in a sleeve or folder.

And this gave me an answer to my own question

1) transfer item to sheet of archival paper
2) secure in place with archival tape
3) photograph, document etc
4) fold paper to make a packet without damaging the item
5) place in archival storage box in labelled acid free or tyvek envelope

Given that the book is so damaged, if we decided to retain it, it would probably make sense to tie it up with cotton tape and place it in an archive box, in which case we would simply put the plant packet in the box along with the book. That way we keep the association between the two objects ...
 


Wednesday, 1 October 2025

Of internet speeds past

 This morning I tooted that our internet speed had jumped to about half a gig, something that is quite amazing in terms of infrastructure for rural Victoria. Admittedly it's only that fast on download, upload speeds are still comparatively slow


but basically fast enough that you really don't need to worry about speed and latency when moving data about. In fact, compared to my first year documenting Dow's, when uploading my day's work at home, typically 70 or 80 jpegs and a few spreadsheets, would flood our ADSL connection, it seems pretty magical.

Then, I couldn't actually upload the data at Dow's, the internet was simply too slow down there, so I ended up resorting to sneakernet and saving my work to a USB drive before uploading it at home.

At the time our fast ADSL connection seemed fairly zippy, especially compared to our house in Canberra where our ADSL connection was incredibly slow and I ended up investing in a 3G router that was plugged into the ISP's modem.

The 3G router used a USB stick modem to connect to the internet, but could be configured to use the ethernet connection to the ISP's modem by default and only fail over to the 3G connection if our rather flaky connection over the old copper wire phone system went away - which it did every time it rained.

The fact the phone cable went via our neighbour's apple tree probably didn't help much either.

Before then we had dialup over a 56k modem.

But that wasn't our first dialup internet.



Around 1990 or 1991 I bought a Global Village Teleport Bronze 2400 baud modem which I plugged into the back of my Mac Classic.

There was something quite magical then about being able to open a terminal session and log into the dialup gateway of the university where I then worked and check the health of servers, send emails, and upload and download documents to work on at home.

This was at a time before the World Wide Web, when text based systems such as gopher were as sophisticated as it got, and there were no real ISPs (in fact we had to shoot down a thought bubble from marketing about starting an ISP in the mid nineties; instead we used to suggest that people use the British Library's service, which was a rebadged version of one of the big commercial ISPs).

It was of course a simpler time.

Letters still came in the mail, and if you needed to order something you either sent the order in the mail, or if it was urgent, by fax, and the internet was really still just an academic plaything.

Contrast that to today, where the internet is essential to just about everything we do, as was shown in the case of Tonga when a volcanic eruption not only cut off the connection to the rest of the planet, but between the main island and outlying islands.

The loss of the internet was crippling, all the more so because the previous satellite based service had been abandoned because the new service was just so much better, and everything, and I mean everything, went via the now broken undersea cable ...


Sunday, 14 September 2025

Using Acrobat's AI summaries with Trove

In my little bits of nineteenth century historical research I use digitised newspaper resources a lot. 

The various digitised resources I use most often are nineteenth century Scottish newspapers via the SLV's subscription to Gale Newsvault for family history stuff, The Times of London's archives again through the SLV, Welsh Newspapers Online, Papers Past NZ, and above all, the NLA's Trove.

Trove is undoubtedly a great resource, but the quality of the digitised text, to put it politely, is variable.

Trove does provide OCR'd transcriptions of the articles, but the quality of the scanned images can make the OCR'd text read as if it had been transcribed by a Martian - strange combinations of letters and punctuation followed by gobbets of reasonable text.

So, for years, what I have done is use the download option to generate a pdf, download the pdf to an ipad, and then sit and make notes on a 'proper' computer.

Latterly, if the pdf is too hard on the human eyeball, I've used J's old iMac, which now runs Linux, and Okular to give me a bigger image at a decent resolution to work with, and that's worked pretty well as a workflow.

Now, as I'm sure you're aware if you're an Acrobat user, Acrobat now behaves like an enthusiastic puppy, always asking if you want it to generate a summary of the document.

I've tended to ignore it, really because most of the PDF documents I look at on Windows are boring things like credit card and electricity account statements, and there are usually only two important bits of information - how much we owe and when payment is due.

But instead of doing the majority of my work on a Linux machine as I usually do, I researched the Panjdeh incident on my Windows machine, and typed my notes into Geany on the old Chromebook I installed Linux on, really as a way of assessing the usefulness of the converted Chromebook.

(Answer, very useful, and good battery life to boot).

Anyway, as I was working on Windows, Acrobat came along wagging its little tail, offering to generate a summary of every pdf document I opened.

So, for a number of longer documents, including some with poor quality OCR'd text, I did.

And they were surprisingly good, and the AI summary engine seemed to deal reasonably well with poorer quality scanned text, producing reasonable and good quality précis of the article texts.

Obviously you need to check the text yourself, but using AI text summaries turned out to be a useful way of assessing if the article was worth reading. It's not the first time I've slogged through a report of court proceedings to find that the report didn't add anything to what I already knew.

It's by no means a panacea, but it's certainly a valuable tool...

Thursday, 11 September 2025

What happens to our photographs when we die?

 An interesting little question popped into my head - what happens to our digital photographs when we die?

Of course we've all wrung our hands about how letters and postcards have been replaced by email meaning that future generations have lost access to our correspondence, denying cultural historians access to sources that describe how people felt about things, but unless I'm very much mistaken, people's digital photographs have not really been thought about.

For example, and this shows the value of sometimes inconsequential seeming objects, I recently picked up a British World War One propaganda postcard from a postcard trading site. Transcribing it turned out to be interesting, with its hint of war weariness among the population as well as worries over the risk of German air raids.

Interesting, and something that one couldn't do about a contemporary conflict, such as that in Ukraine, because all the communication involved would be digital, and I don't see people collecting 100 year old WhatsApp messages the way they used to hang onto (and collect) old postcards.

Now obviously, one doesn't want to keep everything. Broadly speaking, there are two sorts of photographs in people's collections - the transitory and the significant.

The transitory are images like the cracked tail light on a rental car - you photograph it to show it was pre-existing damage, or the back of a wi-fi router to record the password.

Then there's the significant - examples being all my artefact photographs for the National Trust, photographs of old buildings, J's records of her artworks, and so on.

Once they would have been boxes of 35mm slides, and now they exist on a server somewhere.

And of course not everything physical survives - my geeky teenage photographs of closed railway stations in Scotland have gone to landfill in the course of various moves and relocations, along with pictures of former girlfriends, camping trips and the like.

Some of these may have had some value, some not.

And so with digital images, some have significance, for example some of my Trust photographs show the state of decay for some artefacts, and might be of value to future conservators, etc.

And obviously some work has been preserved - for example I know that some of my artefact photographs have been archived, but not all of them, and of course I don't know which ones.

And increasingly there is a problem.

People's collections of potentially archivable material are changing - emails have replaced paper, digital photographs have replaced analogue film, etc etc.

And of course, there's also the problem of obsolete media - recordings on cassette tape, video tapes and the rest. And if they were digitised, where did the digitised version end up, and how is it preserved?

Answers on a postcard?

Saturday, 6 September 2025

Multi factor authentication and the outback

 Australia is a big, really big, sprawling country, and as a consequence there are a lot of places where you don't get mobile coverage.

Sometimes you can get a wifi connection because the local pub has satellite wifi.

If it's Starlink, it's usually not too bad, and wifi calling and text messages can get through. 

If, however, it's the NBN's aging Sky Muster, or some other solution, it can be too slow for wifi calling, and guess what, text messages sometimes don't arrive.

I'm talking seriously slow, the sort of speeds that make you long for character mode email and text based web browsing.

Really frustrating.

And of course you can't then complete the authentication process.

And Google's 'check your other device' solution can be just as bad, especially when you don't actually have your other device to hand, like it's a couple of hundred kilometres away.

The solution, of course, is to do all your set up somewhere with white lines and traffic lights before you go bush, and to make sure you click the 'remember me' box if there is one.

Of course, you don't always remember...