Friday, 4 May 2018

another non isbn

I've been reading Amelia Edwards' A Thousand Miles up the Nile, and I came to the realization that I really needed a copy of Murray's guide to Upper Egypt to make sense of both hers and other nineteenth century travellers' accounts of their travels up the Nile.

Just as today we tend to follow our Lonely Planet or Footprint guides, they followed Murray's.

It turned out that the best and cheapest way to get a copy was to buy a reprint from one of these Indian print on demand shops that pop up on AbeBooks, so I duly ordered a copy.

It of course took about three months to arrive, during which time customs had sequestered it for examination to make sure it wasn't suspicious in any way, but it arrived, and in one piece:


and strangely, it again had a number that looked like an isbn on the cover. So I tried it in isbnsearch.org:

and again, an apparently valid, but unregistered isbn.

Which makes me wonder if there is some software out there that's being used by these overseas print on demand shops to generate fake isbn's ...

[update 05/05/2018]

and in fact, ten minutes with Dr Google has turned up a number of generators such as GeneratePlus


which when pasted into isbnsearch gives the following result


There's also a number of code recipes out there, including one for making fake isbn's for pre isbn books so you can put them into a library management system and track them as assets ...
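As a rough illustration of why a generated number can look 'apparently valid', here is a minimal Python sketch of the ISBN-13 check digit rule (weights alternating 1 and 3, with the final digit bringing the total to a multiple of ten). The function names are my own, and the sample number is the one commonly used in explanations of the check digit, not the number printed on my reprint. Passing this test only shows that a number is well formed, not that anyone ever registered it.

```python
def _all_ascii_digits(text):
    """True if text is non-empty and contains only the characters 0-9."""
    return bool(text) and all(ch in "0123456789" for ch in text)


def isbn13_check_digit(first12):
    """Return the check digit for the first twelve digits of an ISBN-13."""
    if len(first12) != 12 or not _all_ascii_digits(first12):
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... across the twelve digits.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    # The check digit tops the weighted sum up to a multiple of ten.
    return (10 - total % 10) % 10


def looks_like_valid_isbn13(candidate):
    """Check form and check digit only; says nothing about registration."""
    digits = candidate.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not _all_ascii_digits(digits):
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])


if __name__ == "__main__":
    print(looks_like_valid_isbn13("978-0-306-40615-7"))
```

Anything that can compute the check digit can also generate plausible numbers, which is presumably all the online generators are doing.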



Wednesday, 2 May 2018

Tracing patent medicine bottles with ebay

If you're interested in nineteenth century trade patterns, nineteenth century patent medicine bottles are a godsend.

They're usually readily identifiable, often embossed with the manufacturer's name or logo, and what's more, durable, attractive and collectable, which means that they turn up for sale on sites like ebay and gumtree, as well as more specialist bottle collecting sites.

So by treating ebay as a research resource, you can readily work out the rough distribution of the bottles - for example in both the cases of Hayman's balsam of horehound and Jacob Hulle, ebay gave me a rough spread of the bottles.

But of course it's not perfect. For a start there's no real provenance. We don't know where the bottles were found, or indeed under what circumstances. We don't know a date, even a hand waving one based on other items found with the bottles.

For example, with Jacob Hulle, we know that the company was operational for roughly ten to fifteen years, which can give us a rough date.

For Hayman's it's a bit more difficult. I've found adverts as early as 1861 and as late as 1895, which probably brackets the lifespan of the product. There's no real way of dating anything more exactly.

And of course, I'm making a big assumption, that the vendor is located close to where the bottle was found. It may of course have been traded on at a collector's fair and have been found several hundred kilometres from the vendor's location.

Now in the case of Australia, Wales and New Zealand the online collections of digitised newspapers are our friend: they give us clues as to where a manufacturer was advertising, much as we can see that Holloway's pills were advertised extensively in the goldfields, as in this example from the Ovens and Murray Advertiser from 1869:

But then we turn to places such as South Africa.

I'd really like to know if the Hayman's bottles turn up exclusively in the former Cape Colony, or if they were rather more widespread. And of course one can't as there's no real provenance information ...

Tuesday, 1 May 2018

Is trove difficult to use ?

Yesterday, I went to a collections use workshop organised on behalf of Collections Victoria.

Essentially they want to find out how people are using their services and to that end have organised a series of workshops with people working on local history projects, and yours truly was one of them.

I won't go into what we said or did, because I don't think it was particularly remarkable, but one key takeaway is that a lot of the participants said that they found trove, the NLA's digital resources website, difficult to use.

I didn't have time to drill into what was difficult to use, but I got the feeling that a lot of people felt that they did not know how to search effectively online.

Most of those involved were reasonably competent local historians, used to dealing with books and paper archives but being in their sixties appear to have missed out on digital skills training or simply didn't have the opportunity in their past professional lives.

Seems like there might be a training need. And possibly one not too difficult to fill, after all most university libraries have been running online search skills courses for years ....

Friday, 20 April 2018

TextWrangler is end of life, and why I care

For those of you unfamiliar with the product, TextWrangler is a very nice language aware text editor for OS X from the same people who produce BBEdit.

Over the years I've used it mainly to write Markdown, raw HTML and Perl, and it's done me proud. The folks at BBEdit have now decided to cease development of TextWrangler, and encourage people to move to BBEdit, although existing TextWrangler installations will continue to work provided you don't upgrade to the latest version of OS X (now Mac OS).

Essentially if you move to BBEdit, you get a thirty day free trial of the paid for product after which time you can say 'No thanks' and drop down from the paid for product to a free version, BBEdit Lite, which has all the features currently in TextWrangler.

BBEdit don't publish product roadmaps, so we can't say with certainty what's the future of BBEdit Lite, but it's probably fair enough to assume that it'll be around for a few years.

Unfortunate, but that's life. It's their product, and they can do what they like with it.

Personally, I find that these days I'm increasingly going back to the Windows platform, so I'll probably not be that inconvenienced by its demise.

However, over the years, I've helped several citizen science, local history, and other community projects get going, be it counting bugs (real bugs, the ones with six legs) or transcribing old records.

These projects usually struggle to buy a box of teabags and a pack of McVitie's digestives, and this is usually where I get involved.

These projects are often very reliant on volunteer labour and have next to no budget for anything. Basically what I do is try and get their recording methodology in place and help them get software installed.

Often they acquire what IT equipment they have through donations - old iMacs from dentist's surgeries, local library system cast offs, or PCs donated via a bank's community programme.

Now the people involved in these projects are often highly skilled in their specialisation, but they're not really into digital archiving or indeed IT generally.

So, when helping them get going I've tended to emphasise open products with open file formats so that the data can be imported into something else later with a minimum of effort. At the same time I've usually encouraged people to use text as a format for working notes and records because of its clarity and simplicity.

And where possible, I've tried to leave them in a situation where they can be self supporting with simple products that it doesn't matter too much if they don't upgrade.

Now, remember these iMacs from the dentist's surgery (and others from other places).

Over the last 10 or 12 years I've been recommending TextWrangler to my Mac users, because (a) it was rock solid, and (b) free. It's running on old machines, many of which will never, or can never, be upgraded to the latest version of OS X.

That shouldn't be a problem, except that TextWrangler now tells you it's end of life when it checks for updates, and this confuses people. They think they have to upgrade, even when they don't, and the whole 'try before you buy' thing confuses them even more.

And that's creating a support problem. Like I said, unfortunate.

Monday, 16 April 2018

Standard Notes

About a month ago, I bought myself an old ThinkPad as a stopgap replacement computer and installed Standard Notes on it.

Since then I've played with Standard Notes as a note taking application.

Just to be clear, I use OneNote and Evernote to manage documents, be they scanned nineteenth century newspaper extracts, household bills, or useful web pages. On the whole I don't use them to manage working notes.

These I usually simply write up in Markdown - my markdown documents are more of an expanded dot point list rather than a complex document with embedded images and links - using an editor such as Kate and save them with a filename starting with the date and something sensible.
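For what it's worth, the date-first naming habit is easy to automate. The following is only a sketch of one way to do it in Python: the function name, the ISO date prefix and the `.md` extension are my assumptions here rather than a description of any particular tool.

```python
import re
from datetime import date


def note_filename(title, when=None):
    """Build a date-prefixed Markdown filename from a free-text title."""
    when = when or date.today()
    # Lower-case the title and collapse anything that is not a letter
    # or digit into a single hyphen, so the name is safe on any platform.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # An ISO date prefix means an alphabetical listing is also chronological.
    return f"{when.isoformat()}-{slug or 'untitled'}.md"


if __name__ == "__main__":
    print(note_filename("Standard Notes trial", date(2018, 4, 16)))
```

Because the date comes first in year-month-day order, a plain directory listing sorts the notes chronologically without any extra tooling.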

Probably I ought to use something a bit more structured to group documents together rather than a self documenting file structure, but then I've survived forty years on the fringes of academia working that way.

Cherrytree, about which I blogged some time ago, would be a suitable tool, especially as you can locate the .ctb file on Dropbox, OneDrive or what have you to share between machines (and incidentally provide a backup of sorts).

The only concern is that CherryTree is basically a one person project, which has long term support implications, while Standard Notes is owned by a small company and possibly a better option for long term support. Basically if you need to do due diligence on your software tools as part of a project, Standard Notes would probably come out ahead on longevity and risk.

So I had a play with Standard Notes. Out of the box it's fairly sparse, you need a subscription to unlock the clever edits, saving to OneDrive, and other nice features.

Featurewise the basic version is much of a muchness with CherryTree. Given the way I work there's effectively no difference in functionality.

The lack of a native markdown editor in the basic version isn't really a problem, as I've said I usually type up my drafts in an editor to create a file a little like this wiki example. As Markdown is fundamentally a text file it's easy enough to cut and paste the markdown text into the standard notes application to make a new text note.

For me, as the idea of using markdown is to improve readability (basically all I use is indenting and section titling) pasting the file as a text file works fine. If you do some clever things in your note taking, this probably won't work for you.

So, it's a competent product. Out of the box it has some restrictions and limitations, and if you want a full featured note management application, you might want to look elsewhere. As an application for managing working notes in text format it's fine. It does everything that you would expect.

And, unlike its big brothers, it's available for linux.

I would however like to have a 'try before you buy' evaluation mode for the various extensions to be able to explore its capabilities more fully.

But, if you need a competent note taker and management application for text based notes, standard notes might well do the job, especially if you are a linux user ...

Saturday, 17 March 2018

And now, an old Thinkpad

My old Dell Inspiron that I've used since 2010 is finally reaching the stage where it's running out of puff, not disastrously, but getting to the stage where one would start to think about replacing it.

Of course I could just put up with its huffing and puffing and use my MacBook Air as a day to day machine, but I'd reached the stage where a new Windows machine seemed like a necessity.

The only problem was that we'd just paid out for a trip to Borneo, and I really couldn't justify the extra money right now.

Well, I'd always half planned to buy myself an ex-lease Thinkpad for Linux work, so after a little bit of agonising I bought myself an X230 with a Windows 7 Professional license, reasoning that I could use it as a Windows machine, and perhaps even convert it to dual boot - Windows 7 and Linux. The simple reason is that the documentation project I've volunteered for is built around Windows, meaning I need One Note, and most of the rest of my personal notes and documentation is in Evernote, which again is not available for Linux.

So I paid my two hundred and thirty bucks to one of these companies that refurbish ex-lease machines and a few days later it arrived, beautifully packed in shock absorbing packaging and with a refurbisher's test report, and nicely imaged with a clean copy of Windows 7 with an install of Open Office 3 thrown in.

Out of the box, battery life was better than my Air, which realistically manages about two hours work these days between charges. The Thinkpad claimed a realistic four hours thirty out of the box and what's more the battery is easy to replace down the track if needs be.

So, installing things.

First off were the standard utilities that I use:

  • Focuswriter - for distraction free writing
  • Kate - for when only a text editor would do
  • Open live writer - an open source clone of windows live writer for blogging
  • Tweeten - a desktop twitter client
  • Gnumeric - spreadsheet for data manipulation
  • Libre Office - when I need to write something and format it nicely
  • Thunderbird - for email and calendaring
  • Texts - for wysiwyg markdown editing


And then it was the data intensive things

  • Dropbox - for data sharing
  • One Drive - cloudy filestore
  • One Note - Microsoft's note management tool which has all my project documentation
  • Evernote - which basically contains my entire life, invoices, bills, research notes and so on


All in all, close to 40GB of data to download, which took around a day with a few timeouts when we wanted to watch the morning news on iView, or actually use the internet. Given that our internet and phone plan has a stupidly large cap (a terabyte of data per 28 days - effectively it's unlimited, we usually only use around 10% of it) I wasn't worried by the download size.

Also given my general interest in note taking applications I installed Standard Notes for fun, and I should probably also install the windows version of CherryTree given that I waxed lyrical about it a few months ago.

It might have been quicker, but windows also wanted to download a zillion patches (actually a little over 200) and apply them, all of which took time out of the process.

At the end of it I've a machine with reasonable battery life, a decent form factor for working on the train and these silly little tables.

I haven't installed VirtualBox yet, but I'm planning to do so to build a virtual machine to put together a prototype Omeka site to showcase the project so far.

Sooner or later I'll probably add a couple of extra applications - such as the Gramps family history tool.

The only Linux software I really need is tesseract and cuneiform for OCR work on pdfs from old printed documents and they'll run equally well in a Linux VM.

So, next steps.

Basically use it, and keep the old Inspiron for backing up data from the documentation project.

I do face a decision down the track as to whether I keep the machine, or migrate it to linux as originally intended. If I keep the machine I probably need to think about an upgrade to Windows 10, but for the moment seven is good enough. After all it's the software base that's important, not the operating system...

[update 18/03/2018]

... and this morning I was looking up some references and discovered I'd totally forgotten to install my preferred reference manager - Zotero, doh!

Wednesday, 7 March 2018

Lecture recordings and intellectual property

There's a strike over pensions in the UK university system at the moment and it's brought to light an interesting little argument over intellectual property.

Obviously, if a lecturer is on strike, a scheduled lecture is not going to be delivered. Some universities have tried to force lecturers to deliver cancelled lectures once they return to work, or persuade non striking colleagues to deliver them with varied degrees of success.

But some have tried a different tack, giving students access to the lecture recording of the previous year's lecture.

Lecture recordings vary. Some simply record voice or else voice and video somewhat in the style of nineteen seventies Open University recordings. Others record voice and the accompanying powerpoint slides.

Lecture recording systems are usually touted as a revision aid for students, or else as allowing students at multi site institutions access to material delivered at another location. Cynically, it allows students who discover Statistics 1B is timetabled for 0830 on a Monday an extra hour in bed.

Individual universities' rules on intellectual property and lecture content all differ slightly.

In many cases they were drawn up some years ago before lecture capture systems were in widespread use, and before the world went digital.

In some cases the university owns the teaching material, in some cases the individual owns it, and in some cases the lecture is owned by the university, but handouts, including the powerpoint slides, are owned by the individual.

And some lecture recording products have terms of use that require consent by the lecturer before the material can be reused. And of course there's the case where a teaching assistant delivers a lecture using existing notes and material when the lecturer whose course it is is on sabbatical. We won't talk about MOOCs here, but that's another problem, especially if material from other lecture courses is reused.

Basically it's a very grey area. In fact once you start to poke into it it's a complete nightmare ...


Friday, 2 March 2018

When an isbn isn't really an isbn...

Now we all know that isbn's are persistent identifiers par excellence, but I recently came across a case where they weren't.

I'd bought a version of Valentine Baker's Clouds in the East, the book he wrote while imprisoned for his assault on Miss Dickinson, as part of my reading about the Great Game.

I'd bought the reprint from one of these Indian print on demand companies that reprint out of print out of copyright nineteenth century books.

Unlike some of these reprints this one came nicely bound with a card, as opposed to paper, cover and had a barcode, an isbn, and a suggested price in both Indian Rupees and US dollars, in other words rather than print on demand it looked like one of a batch produced for retail sale.

So I entered it into LibraryThing - no such ISBN. Now I know from past experience of having bought books from India that they're usually in Amazon's database, so I was a little surprised.

I tried Amazon India directly - no such luck. Neither was it on isbnsearch.org or barcodelookup.com, so I'm guessing it's an invalid isbn generated by the publisher when packaging the book up to make it look like a 'proper' retail copy.

Strange, hadn't come across that before ...

Wednesday, 28 February 2018

Working in a really small public library ...

Today on the documentation project I was chased out for an hour or so when we had a large tour group come through.

It was too early for a coffee, and I had some notes to write up, so what to do?

It wasn't worth going home, but then I had a brainwave. I googled local library, and discovered there was a local branch library in the town, and it was (a) open and (b) had wifi.

I've written before about working in larger public libraries, but this wasn't the case here - this library was basically a largeish room like a conference room in an old local government building and didn't really have much in the way of workspace provision - just a couple of comfy chairs and an old table with a couple of desktop computers.

But they did have wifi - which claimed to be 5G, and had been moved from some other library as it still had the old name as the SSID, but it worked, or at least it did once I asked the library staff to reset the router for me - apparently a known problem.

I don't actually know what they were using, from my view of it it was a standard looking wifi router -  how it was connected I don't know but I guess over whatever infrastructure the local library corporation provides - I'd guess ADSL.

As a working experience, it was really good - quiet, I could get my work done, no background clamour of coffee making and enough space to sit comfortably if unergonomically with my laptop on my knees, and notebook on the chair adjoining.

So next time you need a place to sit and use wifi, think about (and support) your local public library!

Tuesday, 20 February 2018

Provenance - it's all about provenance

Six months ago, on a plane between Singapore and Melbourne, I watched a remarkable documentary about the attempt by the city of Detroit to sell off the contents of its art museum to defray the city's debts.

The scheme ultimately foundered - because of provenance.

One of the original founders of the museum apparently used to go on collecting tours of Europe, buying paintings from cash strapped aristocrats who had lost everything in the first world war - so you would think it would be easy to work out provenance.

But, no. The person in question was an art collector in his own right, and while he would sometimes use the museum's money to pay for items, sometimes he would use his own, and sometimes he would 'sell' a piece to the museum at below cost and claim it as a tax loss.

And the records were a complete mess.

It wasn't clear which had been bought on behalf of the museum, which were on loan, and which were donations - and of course the more saleable paintings' records were as confused as the less valuable.

In this case having an unclear provenance worked for the museum - they couldn't sell what wasn't theirs, and they didn't know what wasn't theirs.

And I suspect that this is the case with a lot of museums who developed their collections in the late nineteenth and early twentieth century - the documentation is quite unclear.

Not for everything of course, for example the Elgin Marbles have a clear provenance and the case really depends on the legality or otherwise of Elgin's actions and whether his firman from the Ottoman governor really gave him permission.

But then we have cases like the Nizam of Hyderabad's mummy, which I blogged about back in 2015, where provenance is unclear, we know he bought it, but not if it was illegally acquired. Likewise in Amelia Edwards' account of her trip up the Nile in the 1870's, she recounts the story of the tourists who bought a mummy at vast expense, and after a week or so found that they could not stand the sweet odour emanating from it, and (literally) jettisoned their losses by throwing it in the Nile.

And this is all part of the problem of artefacts acquired during the period of European colonialism.

Were the items acquired legally, were they acquired under duress, or what?

And of course rules change. Egypt, for example started to license archaeological digs quite early and had clear rules about both documentation and ownership - basically that the more significant items were automatically property of the Egyptian department of Antiquities, which is why so much of the material is in the Egyptian Museum in Cairo, but as we know, there are significant collections elsewhere, and as the Nizam of Hyderabad case shows us, the system was not perfect.

Other countries, especially those under colonial rule, were not so strict.

And for this reason, probably one thing that should be done is to digitise the museum records and correspondence, as well as that of individual archaeologists and collectors, both to settle the question of provenance and to provide an unrivalled insight into the history of archaeology, and its relationship to the antiquities trade in the nineteenth century ...

Sunday, 21 January 2018

Lenovo Ideapad K1 six years on ...

Yesterday was ferociously hot, so I did what I usually do when it’s too hot for gardening, and played with some old hardware, this time J’s old Lenovo IdeaPad K1, an android tablet dating from late 2011.

In its day it was pretty slick, slicker than the zPad, and a pretty nice bit of kit with an excellent screen - being an artist J spends a lot of time looking at pictures and illustrations - but it was a bit heavy to hold, and even though we'd invested in a stand cum charging station for it, it could be a pain to use for extended periods. Not only that, it would occasionally lose its network connection, or more accurately not recover gracefully when our router flipped from adsl to the backup 3G connection, so eventually it was replaced by a Samsung Galaxy.

By the time it was replaced, Lenovo had more or less abandoned the K1, but had, unusually, provided an option to upgrade it to an unsupported version of Android 4 - the K1 having originally shipped with 3.2.

We never followed that up at the time, as the only thing I used it for was downloading podcasts, and gPodder was happy with things as they were.

In retrospect, this was probably not such a good idea, as the links to the generic version have now (understandably) disappeared off of Lenovo’s website.

So, what can you do with 3.2?

Well, no modern browser, but Opera mini installs and runs quite nicely.

The previously installed wikipedia, gmail and twitter apps still work as does inoreader - an rss feed reader. You can’t, of course install anything recent, which means no decent text editor or anything like that.

But, given that most of what I use my current tablet for is wikipedia, email and twitter, plus a bit of rss feed reading, it isn’t a disaster. Not having access to OneNote or Evernote is a bit of a pain, but were my existing tablet to unexpectedly come to a bad end it would be good enough for a stopgap, which isn’t too bad for a device over six years old running an old operating system ...

Friday, 5 January 2018

Transcribing a blot

One of the tasks in documenting artifacts as part of the project is transcribing labels on the bottles of materia medica in the pharmacy.

Mostly this is fairly straightforward - the labels are on the whole beautifully stencilled in india ink on good quality paper, and so while they may be a little yellowed they're perfectly legible. It's the early twentieth century ones that are more of a problem - cheaper paper and sloppily written in faded fountain pen ink.

To be sure they have their peculiarities - the extensive use of Æ in nineteenth century pharmaceutical latin and outdated abbreviations like TṚ for tincture, but it's all fairly straightforward.

Until a couple of days ago, when I came across the following


where the label had been corrected at a later date - if you look carefully you can see what appears to be an extra L which has been blotted out in a different, thinner ink.

This of course raises a number of questions about transcribing the label - should I transcribe the label as it was meant to be read, or include the blot, or transcribe it as the original text and note that the first L had been blotted out at (presumably) a later date?

I decided to go for the middle route and transcribe the label as you would read it today, blot and all.

While I knew about the Text Encoding Initiative and the Leiden Epigraphy conventions, which I'm using to indicate missing or illegible characters, I didn't know about blots.

My first thought was to simply insert a unicode blot symbol, except there isn't one - as a stopgap until I could spend more time with Google I decided to use the cyrillic Zhe (Ж) as


  • there was no cyrillic text involved in the pharmacy anywhere
  • it sort of looked like the H^HZ^HN sequence we used to use in Wordstar days to generate a cursor symbol on daisywheel printers when doing documentation
  • having learned to read and write Russian I could write it with a degree of fluidity

I guess I could have used the unicode block character ( █ ) but as I also keep a longhand paper workbook in parallel with the transcription spreadsheet Ж seemed a better choice.

I started off by searching for things like 'epigraphy blot' without much success - well I guess stone inscriptions don't have blots, although they do have erasures, so I don't think it was that silly a search. 

Changing the search terms to something like 'TEI transcription blot' was more useful and produced a lot of information on how to represent blots in XML as well as important questions such as whether it was a correction by the author or a correction at a later date and differentiating between the two, as well as what to do if you weren't sure.

The only problem was all this information was for creating XML markup, and I was transcribing the labels to an excel spreadsheet using unicode, and I needed a standard pre-XML way of doing this that was going to be intelligible to someone else.

In the end I found the answer in the epidoc documentation maintained by Stoa.org. Under erased and lost  it not only documented the TEI XML but also referenced previous pre XML paper technology conventions, in this case [[[...]]], which was ideal.
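Swapping the stopgap marker for the agreed notation afterwards is a mechanical job. As a minimal sketch, assuming the transcriptions have been exported from the spreadsheet as plain unicode strings, something like the following Python would do it. The function name and the sample string are invented for illustration, and I've assumed the literal string [[[...]]] is what gets written for a blot whose content can't be read; if the blotted letter is legible the agreed convention may want it inside the brackets instead.

```python
import re

STOPGAP_BLOT = "\u0416"      # Cyrillic capital Zhe, the stopgap blot marker
ERASURE_MARK = "[[[...]]]"   # assumed plain-text notation for an erasure


def mark_erasures(transcription):
    """Replace each run of stopgap blot markers with the erasure notation."""
    # A run of adjacent markers is treated as a single blot.
    return re.sub(f"{STOPGAP_BLOT}+", ERASURE_MARK, transcription)


if __name__ == "__main__":
    # "AB\u0416CD" is a made-up string, not a real label from the pharmacy.
    print(mark_erasures("AB\u0416CD"))
```

This only works because there is no genuine Cyrillic text anywhere in the pharmacy labels, so every Ж in the data is known to be a blot.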

This little journey has raised a whole lot of questions, including should we be using TEI XML encoding for the labels.

The short answer is probably not, unicode in excel plus some standard notation is more than adequate in 99.9% of cases, and the whole majestic edifice that is TEI seems like complete overkill, but certainly this little diversion shows the importance of discussing and agreeing on transcription standards before starting on something as seemingly straightforward as a sequence of nineteenth century materia medica labels ...