Sunday, 14 September 2025

Using Acrobat's AI summaries with Trove

In my little bits of nineteenth century historical research I use digitised newspaper resources a lot. 

The various digitised resources I use most often are nineteenth century Scottish newspapers via the SLV's subscription to Gale Newsvault for family history stuff, The Times of London's archives again through the SLV, Welsh Newspapers Online, Papers Past NZ, and above all, the NLA's Trove.

Trove is undoubtedly a great resource, but the quality of the digitised text, to put it politely, is variable.

Trove does provide OCR'd versions of the articles, but the quality of the digitised text can make the OCR'd text read as if it had been transcribed by a Martian - strange combinations of letters and punctuation followed by gobbets of reasonable text.

So, for years, what I have done is use the download option to generate a PDF, download the PDF to an iPad, and then sit and make notes on a 'proper' computer.

Latterly, if the pdf is too hard on the human eyeball, I've used J's old iMac, which now runs Linux, and Okular to give me a bigger image at a decent resolution to work with, and that's worked pretty well as a workflow.

Now, as I'm sure you're aware if you're an Acrobat user, Acrobat now behaves like an enthusiastic puppy, always asking if you want it to generate a summary of the document.

I've tended to ignore it, really because most of the PDF documents I look at on Windows are boring things like credit card and electricity account statements, and there are usually only two important bits of information - how much we owe and when payment is due.

But instead of doing the majority of my work on a Linux machine as I usually do, I researched the Panjdeh incident on my Windows machine, and typed my notes into Geany on the old Chromebook I installed Linux on, really as a way of assessing the usefulness of the converted Chromebook.

(Answer: very useful, and good battery life to boot.)

Anyway, as I was working on Windows, Acrobat came along wagging its little tail, offering to generate a summary of every PDF document I opened.

So, for a number of longer documents, including some with poor quality OCR'd text, I did.

And they were surprisingly good. The AI summary engine seemed to deal reasonably well with poorer quality scanned text, producing decent précis of the article texts.

Obviously you need to check the text yourself, but using AI text summaries turned out to be a useful way of assessing whether an article was worth reading. It's not the first time I've slogged through a report of court proceedings only to find that the report didn't add anything to what I already knew.

It's by no means a panacea, but it's certainly a valuable tool...

Thursday, 11 September 2025

What happens to our photographs when we die?

An interesting little question popped into my head - what happens to our digital photographs when we die?

Of course we've all wrung our hands about how letters and postcards have been replaced by email, meaning that future generations have lost access to our correspondence, and cultural historians have lost sources that describe how people felt about things. But unless I'm very much mistaken, people's digital photographs have not really been thought about.

For example, and this shows the value of sometimes inconsequential seeming objects, I recently picked up a British World War One propaganda postcard from a postcard trading site. Transcribing it turned out to be interesting, with its hint of war weariness among the population as well as worries over the risk of German air raids.

Interesting, and something that one couldn't do about a contemporary conflict, such as that in Ukraine, because all the communication involved would be digital, and I don't see people collecting 100-year-old WhatsApp messages the way they used to hang onto (and collect) old postcards.

Now obviously, one doesn't want to keep everything. Broadly speaking, there are two sorts of photographs in people's collections - the transitory and the significant.

The transitory are images like the cracked tail light on a rental car - you photograph it to show it was pre-existing damage, or the back of a wi-fi router to record the password.

Then there's the significant - examples being all my artefact photographs for the National Trust, photographs of old buildings, J's records of her artworks, and so on.

Once they would have been boxes of 35mm slides, and now they exist on a server somewhere.

And of course not everything physical survives - my geeky teenage photographs of closed railway stations in Scotland have gone to landfill in the course of various moves and relocations, along with pictures of former girlfriends, camping trips and the like.

Some of these may have had some value, some not.

And so it is with digital images. Some have significance: for example, some of my Trust photographs show the state of decay of some artefacts, and might be of value to future conservators.

And obviously some work has been preserved - for example I know that some of my artefact photographs have been archived, but not all of them, and of course I don't know which ones.

And increasingly there is a problem.

People's collections of potentially archivable material are changing - emails have replaced paper, digital photographs have replaced analogue film, etc etc.

And of course, there's also the problem of obsolete media - recordings on cassette tape, video tapes and the rest - plus, if they were digitised, where did the digitised version end up, and how is it preserved?

Answers on a postcard?

Saturday, 6 September 2025

Multi factor authentication and the outback

Australia is a big, really big, sprawling country, and as a consequence there are a lot of places where you don't get mobile coverage.

Sometimes you can get a wifi connection because the local pub has satellite wifi.

If it's Starlink, it's usually not too bad, and wifi calling and text messages can get through. 

If, however, it's the NBN's aging Sky Muster, or some other solution, it can be too slow for wifi calling and, guess what, text messages sometimes don't arrive.

I'm talking seriously slow, the sort of speeds that make you long for character mode email and text based web browsing.

Really frustrating.

And of course, if the text message with the code never arrives, you can't then complete the authentication process.

And Google's 'check your other device' solution can be just as bad, especially when you don't actually have your other device to hand, like it's a couple of hundred kilometres away.

The solution, of course, is to do all your set up somewhere with white lines and traffic lights before you go bush, and to make sure you click the 'remember me' box if there is one.

Of course, you don't always remember... 

Bunsen Labs ditched

I said I'd try Bunsen Labs Linux in a real-world situation to do real work.

So I did.

Using LibreOffice to review a document, I started to get an annoying intermittent flicker - it could have been a latent hardware fault, or it could be that the Radeon screen driver shipped with Bunsen Labs wasn't optimal for my hardware.

Well, only my pride was affected. I had very little work on the machine, so I wiped it and installed Ubuntu, remembering to click the third party drivers box.

I deliberately chose Ubuntu as they have particularly good support for Lenovo machines.

Well, changing operating systems seems to have cured the flicker problem (maybe).

It's certainly better but it does come back occasionally. The only thing to do is try it for some time and see if it is just as bad with Ubuntu as it was with Bunsen Labs.

(AMD also provide Radeon drivers for Ubuntu, and if the flicker comes back, I might well give these a go. Unfortunately, they don't provide generic Debian drivers, and Bunsen Labs is based on generic Debian, but there is a wiki page on AMD Radeon on Linux.)

Bit of a pity, because I quite liked Bunsen Labs, but to be fair they did warn you on install it was a hobbyist supported distro, and that there might be problems ahead.

While I'm obviously disappointed, it won't stop me from trying Bunsen Labs again on other hardware...

[update 09/09/2025]

Well, I was still getting an intermittent flicker with Ubuntu 24, so I did a little digging.

lspci was correctly showing the graphics card to be an AMD Radeon, but I was still occasionally getting a flicker so, nothing ventured, I downloaded the latest AMD driver for the hell of it.


The lspci output after downloading didn't seem to be a lot different, so I tried doing what I had been doing on Bunsen Labs when I got the annoying flicker, and started paging through the document with LibreOffice.
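
Rather than eyeballing the lspci output, one could check which kernel driver is actually bound to the graphics card. What follows is only a minimal sketch, assuming `lspci -k` (from pciutils) is installed and prints its usual "Kernel driver in use:" lines. The function names `display_drivers` and `current_display_drivers` are hypothetical helpers made up for illustration, not part of any library.

```python
import re
import subprocess

# First line of each lspci entry looks like:
# "01:00.0 VGA compatible controller: <vendor and model>"
DEVICE_LINE = re.compile(r"^(\S+)\s+([^:]+):\s+(.*)$")


def display_drivers(lspci_text):
    """Return a list of (device description, driver or None) for display devices.

    Expects the text produced by `lspci -k`. Only entries whose device class
    mentions VGA, Display or 3D are kept.
    """
    results = []
    current = None  # index into results while reading a display device entry
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # An unindented line starts a new device entry
            match = DEVICE_LINE.match(line)
            device_class = match.group(2) if match else ""
            if any(word in device_class for word in ("VGA", "Display", "3D")):
                results.append([match.group(3), None])
                current = len(results) - 1
            else:
                current = None
        elif current is not None and "Kernel driver in use:" in line:
            # Indented detail line belonging to the current display device
            results[current][1] = line.split(":", 1)[1].strip()
    return [tuple(item) for item in results]


def current_display_drivers():
    """Run `lspci -k` on this machine and parse its output (assumes pciutils)."""
    output = subprocess.run(
        ["lspci", "-k"], capture_output=True, text=True, check=True
    ).stdout
    return display_drivers(output)
```

Calling `current_display_drivers()` before and after a driver change and comparing the two results would show whether the bound driver really changed. On a Radeon card the driver name would typically be something like `amdgpu` or `radeon`, though that depends on the hardware and the driver installed.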

Well, on the basis of a sixty-second test I'm not getting the flicker.

I'll work some more with it to make sure it's really gone ...