Friday, 7 November 2025

Guerilla cataloguing continued

 


Using LibraryThing's Overcat works well, but it's not perfect, occasionally failing to find books even though they turn out to be in other users' collections - I'm guessing because some people have created manual entries from scratch for the books concerned when they couldn't find them via a catalogue search.

However, when I can't find a book, I've been carrying out a manual search of both the British Library and National Library of Scotland catalogues, and then creating a manual entry based on their data and noting the data sources used in the comments section.

Ideally I'd simply rerun the Overcat search from LibraryThing against the British Library, but LibraryThing's link to the British Library catalogue is unreliable, so manual searching it is for the moment.

I would guess that around 90% of the pre-1950s items in the collection were sourced from the UK, even when the book in question was originally published in the USA, so, so far, there's been no need to check the Library of Congress catalogue.

However, working with the books directly has benefits - for instance, the Treloar's hygienic library label at the top of this article came from a 1930s edition of a Max Brand Western novel, suggesting that perhaps the Athenaeum was sometimes buying books second hand to add to their collection.

It also suggests that circulating libraries were still a thing in early 1930s Australia - post-Depression money was tight and, being imported, books were relatively expensive. (Hygienic libraries were circulating libraries that made a point of sterilising books between loans, either by spraying them with antiseptic or placing them in an oven.)

One might have expected public libraries to take up the slack, but perhaps commercial circulating libraries provided a better choice of popular fiction. (I don't know enough about public libraries in 1930s Australia; I'm waving my hands here.)

Also, sometimes one gets to touch history - in our collection we have an 1863 edition of Alice King's now forgotten three volume novel Eveline.

Forgotten now, but obviously very popular when it first appeared, as the flyleaves of each volume - volume 1 is unfortunately missing - are endorsed '10 days allowed' in ink, suggesting that there was considerable demand, and that patrons could only sign the book out for 10 days rather than the more usual fourteen or twenty one.

Friday, 31 October 2025

Guerilla cataloguing part 1

The nineteenth century Prussian general von Moltke the elder is reputed to have said 'No battle plan survives contact with the enemy'.

Well it wasn't quite as bad as that, but my first problem when I tried out our tentative cataloguing methodology was that LibraryThing's link to the British Library catalogue kept timing out on me.

However, the link to LibraryThing's own Overcat database of library records was robust, so our plan has changed to using Overcat in preference to the British Library.

Not all records of books published in the nineteenth century are perfect, so sometimes a little editing was required, but basically, using Overcat with a little cross checking against the National Library of Scotland and British Library catalogues along the way, our plan seems to work well, even if things went a bit more slowly than we hoped.

With only ten records so far, it didn't seem worth doing a MARC export and then using something like FastMRCView to validate the output.

However, actually handling the books was quite valuable. By chance a number of the books we catalogued today were editions of Mary Elizabeth Braddon's novels.

Interestingly they still had paper stickers on the covers saying they were supplied through Mudie's Circulating Library.



Charles Edward Mudie ran an important chain of circulating libraries in mid to late nineteenth century England, charging his subscribers an annual fee of a guinea (£1 1s, or a little over A$220 today using the Bank of England's inflation calculator) to borrow one book at a time - in comparison, Netflix costs $120 a year with ads or $250 ad-free.

Circulating libraries were a middle class thing due to the up front subscribers' fee. They had a possibly undeserved reputation as a supplier of sensation novels to middle class women, and as a place where men and women could interact unchaperoned.

Mudie is also reputed to be responsible for the three volume novel format so common in the Victorian period as it allowed his libraries to lend out the volumes separately rather than have to stock multiple copies of in demand books.

And, as he bought so many copies of books, he became an important wholesaler in his own right, supplying books to overseas circulating libraries, including, quite obviously, the Athenaeum.

Incidentally, the books had green covers with gold stamping. Fortunately they don't turn up in the list of known books where arsenic green bookcloth was used for the cover, but the list isn't exhaustive, so we followed the sensible course of using nitrile gloves when handling them, rather than cloth gloves, or indeed bare hands.

I also learned a little bit about the business of publishing new editions of books in the late nineteenth century.

At that time books were still typeset by hand using movable type, much as they had been in Caxton's time.

However there was one important difference - once set and proofed, the printers would make a papier mâché mould of each page, which they would then use to cast a single metal plate to print from, and these cast plates were called stereotypes.

This of course meant that the type could be quickly broken up and reused, and that, if they kept the moulds, they could quickly make a new set of printing plates if a book needed to be reprinted.

Sometimes, if you look at a late nineteenth century book, it will have 'Stereotype Edition' on the title page, meaning that the book was printed from plates cast from the moulds made for a previous edition, rather than having the type reset.

Interesting what you can learn from cataloguing a few old books...

Thursday, 30 October 2025

Facebook, again

 


Two and a half years ago I abandoned social media, or more accurately the big behemoths that rule our lives, and I've been a lot better for it.

Sure, I've kept on blogging and posting links to Mastodon, but I've not really engaged with any of 'the socials'.

Unfortunately a lot of local history and community groups have continued to use Facebook and it has got to the point that I need (reluctantly) to rejoin Facebook, if only to lurk and look at posts ...

(I'm leaving it as an exercise for the interested to find my account, and I'm not going to do any friend requests or anything like that - the whole experience of rejoining has been quite unsettling: people I don't know being suggested as friends and a waterfall of suggested posts, at best irrelevant, at worst fascist right wing flag waving nonsense.)

Wednesday, 29 October 2025

Cataloguing postcards

 

A few days ago I wrote about a procedure we had developed for cataloguing removable media at the Athenaeum.

I had been quite impressed by the sophistication of the documentation provided by some of our contributors, with human readable and self explanatory directory and file names, but inevitably there are going to be cases where the filenames and directory names are not human readable.

Now obviously we could rename the files and reorganise them but that's probably not sensible, as there may be references to the original filename in the files or accompanying documentation.

So I thought that probably the best solution would be to create a manifest file to be stored alongside the directory listings. 

As an experiment, I thought I'd use as an example a German postcard from 1914 that I'd recently acquired.

As with the original methodology for cataloguing removable media, making a manifest file is actually quite easy if you use the command prompt (cmd.exe).

I'm quite systematic about how I document the various Victorian and Edwardian postcards I've collected over time, and store the scans and information about each postcard in a separate directory under an overall Postcards directory.

In this case a plain dir listing looks like this

Volume in drive C is Windows-SSD
 Volume Serial Number is 62F4-DEE9

 Directory of C:\Users\doug_\OneDrive\Victoriana\Postcards\Schwerin 1914


28/10/25  03:37 PM         3,141,869 2025_10_28 15_36 Office Lens.pdf
28/10/25  03:55 AM         1,905,786 IMG_0347.JPG
28/10/25  03:10 PM         1,851,390 IMG_0349.JPG
28/10/25  03:31 PM               377 schwerin.mkd
28/10/25  03:15 PM           545,497 schwerin.png
               5 File(s)      7,445,392 bytes
               2 Dir(s)  369,753,280,512 bytes free
Not particularly meaningful. 

However, using the tree command from the command prompt, you can create a directory listing with

tree /f > manifest.txt

This will give you a nice little tree listing in the directory which you can then annotate using Notepad or similar to create something like this

Folder PATH listing for volume Windows-SSD
Volume serial number is 62F4-DEE9
C:\USERS\DOUG_\ONEDRIVE\VICTORIANA\POSTCARDS\SCHWERIN 1914
    2025_10_28 15_36 Office Lens.pdf - pdf scan of postcard
    IMG_0347.JPG - face of postcard showing address
    IMG_0349.JPG - rear of postcard showing message
    schwerin.mkd - description of postcard in markdown format
    schwerin.png - montage of IMG_0347.jpg and IMG_0349.jpg

which gives you a human readable description of the contents stored in the same directory as the material you are documenting.

As always procedure is everything - if you always call the annotated file listing manifest.txt it will be consistent across all examples.

(And as a note for command prompt nerds, I deliberately used tree /f rather than dir /b to create the directory listing. Using the tree command makes the process more general purpose, taking account of subdirectories and their contents if present.

As the Linux tree command works similarly, it also makes the procedure more portable than relying on the traditional DOS directory command.)

The actual procedure under Linux is slightly different.

As the Linux version of tree creates its output file before enumerating the file list, you can end up with manifest.txt appearing in the listing.

To avoid this use the command

tree -i > ../manifest.txt

which will create the file in the directory above the current working directory. The -i option suppresses the line drawing characters that give a representation of the directory structure. This creates a simple file that can be annotated as before, and once annotated the file can be moved to your preferred location.
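(Two possible refinements, assuming a reasonably recent GNU tree and GNU find - check the man pages before relying on them. tree's -I flag excludes files matching a pattern, so tree can be told to ignore its own output file and write the manifest in place, and find's -printf can generate a skeleton with a trailing dash after each filename, ready for annotation:

tree -i -I manifest.txt > manifest.txt
find . -type f ! -name manifest.txt -printf '%p - \n' > manifest.txt

Either way you end up with one line per file, which you can then annotate in Notepad or the text editor of your choice as before.)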



Sunday, 26 October 2025

Baked beans and digital preservation

 


It was a wet cold Sunday morning here in North East Victoria, so we had beans on toast for breakfast and listened to the radio.

We prefer Wattie's beans, a New Zealand brand, because they are not quite as sweet as some of the other common brands.

Wattie's beans are almost unique in that they still come in a non-ringpull can, meaning that if you don't have the required access technology, in this case a can opener, you can't get at the beans.

And this is the first part of digital preservation - you need access to the appropriate technology to read the media, either by knowing someone with the correct kit, or by getting hold of a suitable access device, such as a floppy drive, a CD drive or a suitable card reader.

And of course, you need to know how to use them.

Which is why events like the Cambridge Festival of Floppies are important. Old buggers like me who have worked with digital preservation and file format conversion almost all their working lives are either retired or getting close to it - after all, 3.5" floppies dropped out of use roughly twenty five years ago and computers stopped coming with CD drives sometime in the early 2010s. And we won't mention Apple and the weird variable speed floppy drives in pre-OS X Macs.

So, somehow, the message needs to be passed on, which is why technology workshops are valuable. I might remember how to cable up a floppy disk controller and access the media, but I'm not going to be around for ever, and neither are the super convenient USB based floppy drives you can find on eBay.


Some day they're going to stop selling them as there's no profit in them, and anyway, no one makes 3.5" drives any more, meaning most of the external USB 3.5" drives you can buy are made using recycled components. (5.25" drives, and the weird 3" drives used by some Amstrad word processors in the UK in the nineties, are another problem entirely - recycled 5.25" and 3" drives in working condition are almost impossible to find.)

Once you've recovered the files there's also the problem of file format.

For more recent content it's not really a problem - the use of JPEG by digital cameras, and the dominance of Microsoft's file formats and Adobe's pdf, have created a monoculture: if you can read the device you can almost certainly access the content.

And if you can't, both Libre Office and AbiWord between them support a wide range of legacy formats.

But that's by no means the whole problem. What do we do with the content once we have recovered it and have assured ourselves we can read it?

This is actually a live problem: up at the Athenaeum we are increasingly receiving donations of people's family history research material on removable media - almost all on USB sticks, although we do have a few CDs and external hard disk drives.

As we are a volunteer organisation with fairly minimal external funding, we have the whole problem of preserving the data long term; at least the format monoculture means that we are able to read the scanned letters and look at the old photographs without difficulty.

So, we can read the data, look after the media, and try and find a long term storage solution. And, while the content may be digital, it's mostly derived from non digital sources.

The future, of course, will be different.

As we know, no one writes letters any more, and everyone's photographs are saved to the cloud somewhere, which will make the whole business of family history and biography rather more difficult.

In fact, there was an article on the ABC's website this morning bewailing the death of the biography, precisely because no one writes diaries or letters - and of course there is the question of what happens to our digital content when we die.

What this means is that there is no assurance of long term access to digital content as it increasingly moves to the cloud. 

For people working in the field of family history this increasingly means that all their material is stored in the cloud. Even if it was originally in a non digital format it will have been scanned, indexed and stored.

If someone does some oral history work the recordings will be digital, as will any transcriptions. I could go on, but you get the picture.

Creating a portfolio of your work, writing it to a USB stick and giving it to a memory institution such as the Athenaeum is not a solution - we don't have a long term preservation solution of our own, and if we found somewhere to lodge the work, that somewhere would of course be dependent on external funding for the foreseeable future. And, as we have seen with the failure of projects like the Florence Nightingale digitisation project to deliver, even funding does not guarantee either access or continuity of access...

Friday, 24 October 2025

Cataloguing removable media

Up at the Athenaeum, people are increasingly donating USB sticks containing family history information. Usually, as well as family trees, they contain scanned photographs and documents, including birth, death and marriage certificates, as well as immigration records and picture pages from old passports.

All valuable stuff.

And it's not a rare occurrence - today we had two in the space of half an hour.

In some cases they come with some quite detailed documentation, with the best following all the guidelines as regards human readable filenames for directories and files and providing some descriptive information.

Others, perhaps less so, and we need to think about how we document them.

For the moment we need a little procedure to ensure that we catalogue and record the items in a standard way, so that we can keep the USB sticks safe and make sure that the connection between any printed documentation and the USB stick is preserved.

After all, people have entrusted us with their family history research, and the very least we can do is look after it for them in as professional a way as possible.

It has also revealed that we didn't actually have a procedure for managing donated electronic material, so I made one up.

As a procedure it owes something to the one we developed some years ago for ingesting field research data when I was at ANU, when people would bring us data that they wanted to archive - examples include species abundance data and digitised historical documents.

The difference here is that at the Athenaeum we have no content management solution - while the data may eventually end up in Victorian Collections or Trove, at the moment our focus is simply on the safe storage of the donated data.

When I wrote the procedure I had in mind the differing skill levels of our volunteers, so I tried to make it as mechanical as possible and not too different from the way we ingest data about artefacts - the draft is available to download as a pdf.

The document is very much a work in progress, and may be subject to revision. In the meantime, please feel free to take a look, and reuse the content if it seems appropriate.




Sunday, 12 October 2025

Ipads versus Android tablets

 Just under a year ago, I bought myself a refurbished iPad as some applications had stopped working on my pandemic era Huawei MediaPad, basically due to it being stuck on an old version of Android.

I expected that over the course of this year I'd gradually change over to using the iPad exclusively, and the MediaPad would go to the ewaste centre.

A great pity, as it is an excellent device, but facts have to be faced, and Apple own the tablet space in Australia and Android devices are not even a minority sport.

However, due to my being a total gonk and failing to realise that if you buy a subscription to a news website through Google, in most cases you only get access to the Android app, I've kept on using the Huawei to read the news in the morning and check the weather.

This has given me an opportunity to compare both devices over the longer term.

Tablets don't really have to do much other than run an application, download and display content, so things like memory and processor power are not important - as long as they have enough to do the job in a timely manner it doesn't matter if one has a higher performance benchmark than the other.

In fact both are roughly the same age and roughly the same specification - the Huawei has a bit more memory - certainly you don't feel any significant difference in performance when using YouTube or Spotify.

Where you do see a difference is in switching between applications or indeed cutting and pasting content between the two.

The iPad is simply clunkier. It does the job, but it's clunkier, and I put this down to the fact that Android is inherently multitasking, while older versions of iPadOS are not.

This isn't a showstopper by any means - if all you want is a device to review documents on or watch videos, you probably don't care that much.

Strangely, the one real differentiator is long term operating system support - Apple are still pushing out updates for a five year old device while the MediaPad has dropped off Huawei's update list.

So, if I was to go out and buy a replacement device today, which would it be?

A current model iPad bought from Apple in Australia is A$600, while the current Honor Pad is around A$550 bought from Amazon in Australia. (Since I bought my MediaPad, Huawei have both rebranded their phone and tablet business unit as Honor and sold it to another Chinese electronics manufacturer to avoid US sanctions on the Huawei parent company.)

Amazon also sell grey market imports of the previous Honor Pad, the 8a, for around A$250. The 8a is based around Android 14, which is still supported.

Given the price advantage of the grey market import of the previous model, I think that's the one I would go for, if I wanted a new and competent device and didn't want to spend six hundred bucks on a tablet.

Refurbished Huawei and Honor devices are not really an option - you're unlikely to get any operating system updates. Refurbished iPads are competent, but more recent models attract a price premium meaning there's little advantage over buying new.

So, there we have it. As always, your mileage may vary, especially depending on exactly how you intend to use the device. What I would steer clear of are some of the remaindered Huawei branded MediaPads floating around various online marketplaces - the supported operating systems are simply too old, even though the hardware is still good and performs well.

Friday, 10 October 2025

Guerilla cataloguing - part 0

 I've mentioned before that we planned to recatalogue the heritage book collection using LibraryThing, the heritage book collection being the contents of the Athenaeum when it functioned as the town library in Stanley.

As far as we can tell, they hardly ever deaccessioned anything, giving us a picture of changing reading tastes from sometime around 1862 to 1971, when it ceased to function as a library.

Actually, I suspect tastes haven't changed much, given the number of early copies we have of novels by Louisa M Alcott, Mary Elizabeth Braddon, Wilkie Collins and the rest - clearly the nineteenth century subscribers to the library had the same liking for mysteries and sensation novels as we do today.

Until we try it, we've no real idea how well recataloguing with LibraryThing and our proposed methodology is going to work.

To refine and document our procedures, we are going to run a pilot project on a few shelves to see how well it works, and if it works well, we'll turn it into a guerilla project where we basically just do it, and don't worry overmuch about deadlines or formal project plans.

We intend to try and get other people involved so we can turn the project round fairly quickly, which means we need a simple and robust set of procedures to bring people on board and get them up to speed - quite different from the documentation of Dow's and Lake View, where there was only me, and the main reason for documenting procedures was to avoid drift and capture any changes to the methodology.

So today was part 0 of the exercise: creating an account on LibraryThing for the Athenaeum and, since part of what we want out of it is a set of MARC records to allow us to port the catalogued data to another library system, identifying some tools for verifying and manipulating MARC records - especially as, instead of class marks or any standard cataloguing scheme, the original spreadsheet used shelf position.

This is worse than it sounds - for the third book from the left on the front row of shelf C the shelfmark is C3F, and for the third book from the left on the rear row it is C3B. Unfortunately there's no guarantee that there are the same number of books in the front and back rows - as a scheme it's almost as eccentric as the Cotton Collection classification scheme.

So basically, we need to be able to validate the MARC output.

MARC is a binary format dating from the early days of library computing and, like BibTeX, is essentially a lowest common denominator format, ie one most other systems can read and process.

So, what we need is a utility that can read the binary MARC file and display the file in a human readable form - something that with MARC is a bit of an exaggeration.
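For anyone who hasn't met MARC in the wild, the human readable form - MarcEdit calls it the mnemonic format - looks something like this. This is a minimal made-up record purely for illustration, not one of ours:

=LDR  00000nam a2200000 a 4500
=100  1\$aKing, Alice.
=245  10$aEveline :$ba novel /$cby Alice King.
=260  \\$aLondon :$bHurst and Blackett,$c1863.

One line per field, with the three digit tag and the indicators up front and $ introducing each subfield code - terse, but a great deal more readable than the raw binary.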

Now, the last time I did any serious work with MARC was twenty years ago when I wrote a simple parser in Perl to take a set of MARC records and format them so that the records looked like old fashioned card catalogue images.

I forget why I was asked to do this, but I remember looking out at the rain coming down on the museum car park while I fiddled with regular expressions.

So we needed something to let us examine the contents of MARC files, and given that we have a budget of zero dollars and zero cents for this exercise it had to be both free and public domain.

Well, there's not a lot of choice - basically it seems to come down to Terry Reese's MarcEdit, which has the merit of being endorsed by the Library of Congress, FastMRCView, produced by the Russian State Library (formerly the Lenin Library) in Moscow, and the online only MRV MARC Record viewer.

Otherwise there don't seem to be a lot of options out there - it's quite possible that I've missed a couple of other public domain applications - but playing with MARC seems to be very much a minority sport.

I've decided, quite unilaterally, to go with both MarcEdit and FastMRCView in the pilot and compare the output - while both seem to do what it says on the tin, there's always a risk that one application interprets the data slightly differently from another.

FastMRCView is a Windows only application, while MarcEdit comes in Windows, Linux and OS X flavours. As most of the prior work on the catalogue has been done on Windows, there's no pressing need to change operating systems.

So, we have our account and some software that looks as if it might help with the gnarly stuff; all that remains is to try and see how well our proposed methodology works in practice ...

Wednesday, 8 October 2025

Changes to blogger.com

Google appears to have decided that the way to improve the Blogger experience is to add some AI generated stuff to it - see the following couple of screenshots


I'm doubtful about this, especially the latter option, given that Google's AI enabled search often gets things wrong on obscure topics (like the ones I post on).

As to what a 'Google experience' constitutes, I'm not sure.

For this reason I'm going to ignore these new options for the moment ...



 


Friday, 3 October 2025

Plant remains in heritage books

 Up at the Athenaeum today we had a little conundrum.

We had been donated a book dating from the early 1870s, which had been given as a Sunday School prize to a member of a local family still resident in the area.

The book's exact provenance is unknown - there are some markings in pencil that suggest that at some time it had been resold in a second hand bookstore - but there's no doubt about its origins: the original dedication is intact, giving the name of the recipient, the date, and the location - in this case Three Mile, a now vanished mining settlement on the outskirts of Beechworth.

Unfortunately, the book is a wreck. The spine is broken, and there are loose pages, possible insect damage, and foxing. It is probably not worth conserving, but might be worth retaining as is because of its local connection, especially as the family still live in the area.

But when I was leafing through it to check for damage I found this


someone had at some point put a small plant inside, possibly as a keepsake or a bookmark.

Now, if we decided to preserve the book because of its connection to a local family, rather than simply photographing the dedication on the flyleaf, what do we do about the plant remains?

Well, I didn't know. Google was singularly useless, so I appealed to Mastodon.

No one replied, but I had a brainwave.

When I was a much younger man, I had a girlfriend who was a field botanist.

When we went for a bushwalk, if she found an interesting plant she hadn't seen before she would take a sample, wrap it in a bit of newsprint and put it inside a field guide for later identification.

Putting it inside a field guide kept the sample flat and the newsprint absorbed the moisture (more or less).

Proper herbaria - reference collections of dried plants - are a little more elaborate, but not by much, with the plants being pressed flat on absorbent acid free paper, transferred to a fresh sheet of archival paper, attached with archive quality paper tape, and stored in a sleeve or folder.

And this gave me an answer to my own question

1) transfer item to sheet of archival paper
2) secure in place with archival tape
3) photograph, document etc
4) fold paper to make a packet without damaging the item
5) place in archival storage box in labelled acid free or tyvek envelope

Given that the book is so damaged, if we decide to retain it, it would probably make sense to tie it up with cotton tape and place it in an archive box, in which case we would simply put the plant packet in the box along with the book - that way we keep the association between the two objects ...
 


Wednesday, 1 October 2025

Of internet speeds past

This morning I tooted that our internet speed had jumped to about half a gig, something that is quite amazing in terms of infrastructure for rural Victoria. Admittedly it's only that fast on download; upload speeds are still comparatively slow


but basically fast enough that you really don't need to worry about speed and latency when moving data about. In fact, compared to my first year documenting Dow's, when uploading my day's work at home - typically 70 or 80 jpegs and a few spreadsheets - would flood our ADSL connection, it seems pretty magical.

Back then I couldn't actually upload the data at Dow's - the internet was simply too slow down there - so I ended up resorting to sneakernet, saving my work to a USB drive before uploading it at home.

At the time our fast ADSL connection seemed fairly zippy, especially compared to our house in Canberra, where our ADSL connection was incredibly slow and I ended up investing in a 3G router that was plugged into the ISP's modem.

The 3G router used a USB stick modem to connect to the internet, but could be configured to use the ethernet connection to the ISP's modem by default and only fail over to the 3G connection if our rather flaky connection over the old copper wire phone system went away - which it did every time it rained.

The fact the phone cable went via our neighbour's apple tree probably didn't help much either ...

Before then we had dialup over a 56k modem.

But that wasn't our first dialup internet.



Around 1990 or 1991 I bought a Global Village Teleport Bronze 2400 baud modem which I plugged into the back of my Mac Classic.

There was something quite magical then about being able to open a terminal session and log into the dialup gateway of the university where I then worked and check the health of servers, send emails, and upload and download documents to work on at home.

This was before the world wide web; text based systems such as gopher were as sophisticated as it got, and there were no real ISPs. (In fact we had to shoot down a thought bubble from marketing about starting an ISP in the mid nineties - instead we used to suggest that people use the British Library's service, which was a rebadged version of one of the big commercial ISPs.)

It was of course a simpler time.

Letters still came in the mail, and if you needed to order something you sent the order in the mail or, if it was urgent, by fax - and the internet was really still just an academic plaything.

Contrast that with today, where the internet is essential to just about everything we do, as was shown in the case of Tonga, when a volcanic eruption not only cut the connection to the rest of the planet, but also the connections between the main island and outlying islands.

The loss of the internet was crippling, all the more so because the previous satellite based service had been abandoned because the new service was just so much better, and everything, and I mean everything went via the now broken undersea cable ...


Sunday, 14 September 2025

Using Acrobat's AI summaries with Trove

In my little bits of nineteenth century historical research I use digitised newspaper resources a lot. 

The various digitised resources I use most often are nineteenth century Scottish newspapers via the SLV's subscription to Gale Newsvault for family history stuff, The Times of London's archives again through the SLV, Welsh Newspapers Online, Papers Past NZ, and above all, the NLA's Trove.

Trove is undoubtedly a great resource, but the quality of the digitised text, to put it politely, is variable.

Trove does provide OCR'd text for the articles, but the quality of the digitisation can make the OCR'd text read as if it had been transcribed by a Martian - strange combinations of letters and punctuation followed by gobbets of reasonable text.

So, for years, what I have done is use the download option to generate a pdf, download the pdf to an iPad, and then sit and make notes on a 'proper' computer.

Latterly, if the pdf is too hard on the human eyeball, I've used J's old iMac, which now runs Linux, and Okular to give me a bigger image at a decent resolution to work with, and that's worked pretty well as a workflow.

Now, as I'm sure you're aware if you're an Acrobat user, Acrobat now behaves like an enthusiastic puppy, always asking if you want it to generate a summary of the document.

I've tended to ignore it, really because most of the PDF documents I look at on Windows are boring things like credit card and electricity account statements, and there are usually only two important bits of information - how much we owe and when payment is due.

But instead of doing the majority of my work on a Linux machine as I usually do, I researched the Panjdeh incident on my Windows machine, and typed my notes into Geany on the old Chromebook I installed Linux on, really as a way of assessing the usefulness of the converted Chromebook.

(Answer, very useful, and good battery life to boot).

Anyway, as I was working on Windows, Acrobat came along wagging its little tail, offering to generate a summary of every pdf document I opened.

So, for a number of longer documents, including some with poor quality OCR'd text, I did.

And they were surprisingly good - the AI summary engine seemed to deal reasonably well with poorer quality scanned text, producing good quality précis of the article texts.

Obviously you need to check the text yourself, but using AI text summaries turned out to be a useful way of assessing whether an article was worth reading - it's not the first time I've slogged through a report of court proceedings to find that the report didn't add anything to what I already knew.

It's by no means a panacea, but it's certainly a valuable tool...

Thursday, 11 September 2025

What happens to our photographs when we die?

 An interesting little question popped into my head - what happens to our digital photographs when we die?

Of course we've all wrung our hands about how letters and postcards have been replaced by email, meaning that future generations have lost access to our correspondence, denying cultural historians access to sources that describe how people felt about things - but unless I'm very much mistaken, people's digital photographs have not really been thought about.

For example, and this shows the value of sometimes inconsequential seeming objects, I recently picked up a British World War One propaganda postcard from a postcard trading site. Transcribing it turned out to be interesting, with its hint of war weariness among the population as well as worries over the risk of German air raids.

Interesting, and something that one couldn't do for a contemporary conflict, such as that in Ukraine, because all the communication involved would be digital, and I don't see people collecting 100 year old WhatsApp messages the way they used to hang onto (and collect) old postcards.

Now obviously, one doesn't want to keep everything. Broadly speaking, there are two sorts of photographs in people's collections - the transitory and the significant.

The transitory are images like the cracked tail light on a rental car - you photograph it to show it was pre-existing damage, or the back of a wi-fi router to record the password.

Then there's the significant - examples being all my artefact photographs for the National Trust, photographs of old buildings, J's records of her artworks, and so on.

Once they would have been boxes of 35mm slides, and now they exist on a server somewhere.

And of course not everything physical survives - my geeky teenage photographs of closed railway stations in Scotland have gone to landfill in the course of various moves and relocations, along with pictures of former girlfriends, camping trips and the like.

Some of these may have had some value, some not.

And so with digital images - some have significance; for example, some of my Trust photographs show the state of decay of some artefacts and might be of value to future conservators, etc.

And obviously some work has been preserved - for example I know that some of my artefact photographs have been archived, but not all of them, and of course I don't know which ones.

And increasingly there is a problem.

People's collections of potentially archivable material are changing - emails have replaced paper, digital photographs have replaced analogue film, etc etc.

And of course, there's also the problem of obsolete media - recordings on cassette tape, video tapes and the rest - plus, if they were digitised, where did the digitised version end up, and how is it preserved?

Answers on a postcard?

Saturday, 6 September 2025

Multi factor authentication and the outback

Australia is a big, really big, sprawling country, and as a consequence there are a lot of places where you don't get mobile coverage.

Sometimes you can get a wifi connection because the local pub has satellite wifi.

If it's Starlink, it's usually not too bad, and wifi calling and text messages can get through. 

If, however, it's the NBN's ageing SkyMuster or some other solution, it can be too slow for wifi calling, and, guess what, text messages sometimes don't arrive.

I'm talking seriously slow, the sort of speeds that make you long for character mode email and text based web browsing.

Really frustrating.

And of course you can't then complete the authentication process.

And Google's 'check your other device' solution can be just as bad, especially when you don't actually have your other device to hand, like it's a couple of hundred kilometres away.

The solution, of course, is to do all your setup somewhere with white lines and traffic lights before you go bush, making sure you click the 'remember me' box if there is one.

Of course, you don't always remember... 

Bunsen labs ditched

 I said I'd try Bunsen Labs Linux in a real world situation to do real work.

So I did.

Using LibreOffice to review a document, I started to get an annoying intermittent flicker - it could have been a latent hardware fault, or it could be that the Radeon screen driver shipped with Bunsen Labs wasn't optimal for my hardware.

Well, only my pride was affected - I had very little work on the machine - so I wiped it and installed Ubuntu, remembering to click the third party drivers box.

I deliberately chose Ubuntu as they have particularly good support for Lenovo machines.

Well, changing operating systems seems to have cured the flicker problem (maybe).

It's certainly better but it does come back occasionally. The only thing to do is try it for some time and see if it is just as bad with Ubuntu as it was with Bunsen Labs.

(AMD also provide Radeon drivers for Ubuntu, and if the flicker comes back, I might well give these a go. Unfortunately, they don't provide generic Debian drivers, and Bunsen Labs is based on generic Debian, but there is a wiki page on AMD Radeon on Linux.)

Bit of a pity, because I quite liked Bunsen Labs, but to be fair they did warn you on install that it was a hobbyist supported distro, and that there might be problems ahead.

While I'm obviously disappointed, it won't stop me from trying Bunsen Labs again on other hardware...

[update 09/09/2025]

Well, I was still getting an intermittent flicker with Ubuntu 24, so I did a little digging.

lspci was correctly showing the graphics card to be an AMD Radeon, but I was still occasionally getting a flicker, so, nothing ventured, I downloaded the latest AMD driver for the hell of it.
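(If you want to check which kernel driver has actually claimed the graphics card, something like

lspci -k | grep -A 3 -i vga

works on most distributions - the 'Kernel driver in use:' line in the output shows whether it's amdgpu or the older radeon driver doing the work.)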


The lspci output after installing the new driver didn't seem a lot different, so I tried doing what I had been doing on Bunsen Labs when I got the annoying flicker, and started paging through a document with LibreOffice.

Well, on the basis of a sixty second test I'm not getting the flicker.

I'll work some more with it to make sure it's really gone ...


Monday, 18 August 2025

Bunsen Labs in use

I was impressed by Bunsen Labs Linux running on a VM, so much so that I decided to try it on a real machine.

The only machine I had to hand was my old desk laptop, which had been gathering dust for about nine months since I'd upgraded to Windows 11 - not by choice, but sometimes you have to stay compatible with the world.

Anyway, long story short, it's an AMD Ryzen based Lenovo laptop, and even now after some six years of use, a pretty meaty machine.

Installation was easy - it just flew, and it gave me a working system in around 45 minutes. I probably spent longer trying to get it to boot from the install USB. (Just to be different, the AMD based laptop didn't use a magic function key combination to get into the boot menu; instead you needed to use a SIM ejector (or a bent paperclip) to poke a special recessed button on the case while the machine was powered off. This caused the machine to boot into the boot menu - and yes, I did have to track down the manual online to find this out.)


In use - and I haven't used it seriously as yet - it's quite impressive, and pretty capable.

Memory and cpu use is minimal, as is disk use, and the machine simply feels fast. My plan is to use it as an alternative to the Windows 11 laptop on my desk and see how it compares, as well as using it for a couple of projects ...


Sunday, 17 August 2025

Bunsen Labs Linux

For the last twenty or so years I've been reusing old computer hardware for various of my projects, something that has invariably involved installing Linux, as software bloat on both Windows and OS X has often reduced the usefulness of the hardware (which is also why I've been able to pick up some pretty good machines for not a lot from hardware recyclers and refurbishers).

I've played with quite a few distributions over the years, but these days the two I feel most comfortable with are Crunchbang++ and Ubuntu.

Crunchbang++ I tend to install on resource limited hardware - which is why I used it when installing Linux on a Chromebook - and Ubuntu on anything else.

Crunchbang started out as a custom Linux distribution designed to use fewer resources than most mainstream distributions.

Development of the original project halted in 2015, but it spawned two successor projects, Crunchbang++ and BunsenLabs Linux.

For a long time both projects were very similar as regards installation and the user experience, and I did run BunsenLabs Linux on an old netbook for a number of years, but for the last few years Crunchbang++ has been my go-to lightweight distribution.

However, when I was working out what I could use in my Linux on Chromebook project, I came across quite a few reviews that mentioned Boron, the latest BunsenLabs release, as being quite slick and resource efficient, though not quite as minimal in its disk usage as Crunchbang++.

So, I thought I'd take a look, and this morning I built a BunsenLabs VM using VirtualBox on my Dell Latitude.

Like Crunchbang++, installation was via the standard Debian installer, and once booted and logged in you are presented with a customised Openbox desktop not that different from the standard Crunchbang desktop, albeit in a nice blue green shade and with the time and connection status on the bottom left rather than the top right


Like Crunchbang there is an option to install additional software


but unlike Crunchbang++, AbiWord and Gnumeric are not installed and there's no option to skip the installation of LibreOffice - though, to be fair, if you have LibreOffice there's no real need to install AbiWord and Gnumeric.

Now, when I installed Crunchbang++ on my old Chromebook, I deliberately went for AbiWord and Gnumeric rather than LibreOffice in the expectation that I would save a bit of disk space - remember that the Asus C202 Chromebook only had 16GB of eMMC storage - so what is the disk usage under Bunsen Labs?



and it's not that bad - around 6GB, about the same as Crunchbang++ without LibreOffice


making BunsenLabs a realistic option on resource constrained hardware.
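(If you want to make the same comparison on your own machine, df -h / will show how much of the root filesystem is in use, and something like du -sh /usr will give you a rough feel for where the space has gone.)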

Personally, I'm comfortable with Crunchbang++ and in no hurry to change, but I would certainly be happy to suggest BunsenLabs as an alternative to other lightweight distributions such as Lubuntu, especially in a situation where the user experience is important - the current BunsenLabs desktop feels a little more slick and modern than the current Lubuntu or Crunchbang++ desktops...



Wednesday, 6 August 2025

Linux on an old Chromebook

 Some time ago, for what seemed entirely sensible reasons, I bought myself an old Chromebook.


In practice, it has turned out not to be quite as useful as it might be. However, the screen’s in good condition and the keyboard is nice to type on, so I wondered, could I install Linux on it?


When I say install, I mean replace ChromeOS (which is a barebones quasi-Linux under the hood) with another version of Linux entirely.


The machine is an Asus C202SA, which means it comes with an Intel Celeron N3060 processor, 16GB of eMMC storage and 4GB of RAM. Not the fastest device on the planet, but by no means the slowest.


The original Linux based Eee PC 701SD had an even slower Celeron processor, only half as much in the way of storage, and far less in the way of RAM - a measly 512MB - but I successfully installed Crunchbang Linux on it way back in 2014.


Using my previous experience with the Eee, I reckoned that it should be able to run a current distribution of Crunchbang++ successfully.


A Crunchbang install typically takes up a little less than 6GB of disk, and when idle the system only uses around 512MB of RAM, so it should work. The install process has a breakpoint that lets you skip large applications such as LibreOffice, so with a bit of luck a minimal install should come in smaller than the typical 6GB.


A web browser, a lighter weight word processor such as AbiWord, a text editor and a lightweight spreadsheet such as Gnumeric should give me most of the functionality I’d need.


So, how to install?


Chromebooks are designed to run ChromeOS and have a number of features to prevent people installing alternative operating systems.


However, for a few years there was a project, GalliumOS, to develop an alternative to ChromeOS for Chromebooks.


The project’s now been discontinued, but the project wiki has a wealth of information about installing alternative operating systems on older ChromeBooks.


In the case of my Asus, you need to replace the startup firmware (the BIOS, if you are old school) with an alternative firmware image.


Chromebooks typically have a write protect setting on the firmware and this needs to be disabled.


MrChromebox.tech supply replacement firmware for Chromebooks and have a pretty comprehensive list of models and how to disable write protection.


In the case of ‘my’ Chromebook it comes up with





meaning that you need to crack the case and remove a screw from the motherboard.


Fortunately the C202 and variants are designed for easy repair, and opening up the machine is straightforward with no nasty glue or anything like that involved, and there are a number of videos on YouTube, mostly featuring intense young men explaining exactly how to take one apart.


So, first things first.


It’s a 64-bit machine, so I downloaded the latest 64-bit ISO image of Crunchbang++ (aka CBPP) and, using Rufus, made a dd style boot volume. The latest image is only available as a torrent, meaning I needed to install µTorrent to download the image.


µTorrent is a paid-for application these days, though there is still a basic free version - you need to be resolute and ensure you select the free version, which comes with some mildly annoying ads.


Then the first slightly scary bit - cracking the case open using a prying tool (a standard mobile phone case separator; mine came from eBay for less than five bucks) to separate the two halves of the case.



and then it was simply a matter of removing the write protect screw - helpfully marked with a big arrow - putting the box back together, following the instructions about getting into developer mode, and downloading the firmware update script


and executing it


Once the firmware had been flashed it was simply a matter of rebooting and running the install script.
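(For the record, the firmware update side of this boiled down to opening a shell in developer mode and running something like

cd; curl -LO mrchromebox.tech/firmware-util.sh && sudo bash firmware-util.sh

but that's from memory - check the current instructions on the MrChromebox site before running anything, as the script name and URL may well have changed.)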

There were a couple of oddities during the install process - despite it being the standard Debian 12 graphical installer and very standard hardware, the mousepad didn't work, and, more alarmingly, the first time around the disk partitioner didn't work.

In the latter case, I backed out, rebooted the machine and reran the install routine, and this time the disk - well, a 16GB eMMC unit - partitioned properly.

After the installation script had completed, the machine rebooted, and after logging in I was greeted by the standard Crunchbang updates and additional software screen


As planned, I ran the software updates, but to save disk space didn't install LibreOffice or the other optional software.

I then shut it down, powered it back up and checked that everything was normal and that the mousepad worked.

Everything looked good so here's a final image of the machine with AbiWord open


I'm quietly pleased with the result - I now have a fairly tough Linux laptop that, being based on hardware designed for the education marketplace, should stand a reasonable amount of abuse and have half decent battery life.

Installed, Crunchbang++ and the minimal application set takes up a bit less than 6GB - not quite as good as I hoped but something I can certainly live with, as it gives me roughly another 6GB free space plus something for swap.

For comparison, my two other Crunchbang++ machines, which have a full software install including LibreOffice and a few extra programs such as Focuswriter and Notable, come out closer to 13GB, but then they are not so constrained for disk space, both having 128GB SSDs.

I am no technical genius - the last time I played seriously with hardware and firmware was over twenty years ago - so while I had the skills to open up the machine and remove the write protect screw, and some understanding of what was going on when I flashed the UEFI firmware, to a large extent I was simply following the bouncing ball.

Standing on the shoulders of giants, I think it's called, and I couldn't have done this without some very clever people making their work freely available.

While this might not be for everyone, given the right hardware, the actual installation of Linux was no more difficult than on a standard laptop, and it certainly got me out of the 'no more updates' Chromebook trap ...