Wednesday, 22 May 2019

The joy of bibtex ...

The project's been chugging along nicely, and I've nearly finished documenting the dispensary and the back shop - we originally thought that there would be around 4000 items in total, but I've already documented around three and a half thousand, and there's still the shop to do.

Recently, one of the groups of items documented was a set of reference books - pharmacopoeias mainly, the earliest from 1914, the latest from 1963.

They are too early to have ISBNs, and some are different editions of the same pharmacopoeia.

So, how to document them and provide a unique reference, and preferably one that was machine readable?

BibTeX!

All the books, and the correct editions, were on the National Library of Australia's catalogue, which provides a handy download of the BibTeX reference. That gives us a professionally compiled description of the item, plus a reference to the NLA's catalogue to allow someone in the future to do a simple double check.

The one exception was a book which I couldn't find in the NLA, or any of the state libraries in Australia, but did find in the British Library, which unfortunately doesn't provide a handy citation export in BibTeX format.

I could, I suppose, have downloaded the citation in the BL's preferred format and run it through one of the EndNote to BibTeX, or MARC to BibTeX, conversion tools, but as it was only one entry, downloading, installing, and then checking the output seemed almost as much work as creating an entry by hand, so I ended up hand creating an entry based on the BL's RIS output.
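For anyone faced with more than one such entry, the RIS to BibTeX step could be sketched roughly as below. This is only a minimal illustration, not the method I used (I did mine by hand): the tag mapping covers a handful of common book fields, the function names `parse_ris` and `to_bibtex` are my own invention, and the sample record is made up rather than taken from any catalogue.

```python
# Minimal sketch: turn one RIS record into a BibTeX @book entry.
# The tag mapping is a simplification covering only a few common book fields.
RIS_TO_BIBTEX = {
    "TI": "title",
    "T1": "title",
    "PB": "publisher",
    "CY": "address",
    "PY": "year",
    "Y1": "year",
    "ET": "edition",
}


def parse_ris(text):
    """Return a dict of BibTeX fields from a single RIS record."""
    fields = {}
    authors = []
    for line in text.splitlines():
        # RIS lines look like 'TI  - Some title':
        # a two character tag, two spaces, a hyphen and a space.
        if len(line) < 6 or line[2:6] != "  - ":
            continue
        tag, value = line[:2], line[6:].strip()
        if tag in ("AU", "A1"):
            authors.append(value)
        elif tag in RIS_TO_BIBTEX and value:
            # Keep the first value seen for each field.
            fields.setdefault(RIS_TO_BIBTEX[tag], value)
    if authors:
        fields["author"] = " and ".join(authors)
    if "year" in fields:
        # RIS dates may be written 'YYYY/MM/DD'; keep only the year.
        fields["year"] = fields["year"].split("/")[0]
    return fields


def to_bibtex(key, fields):
    """Format the fields as a BibTeX @book entry with the given citation key."""
    body = ",\n".join(
        f"  {name} = {{{value}}}" for name, value in sorted(fields.items())
    )
    return f"@book{{{key},\n{body}\n}}"


# Made-up record for illustration only; not a real catalogue entry.
SAMPLE_RIS = """TY  - BOOK
AU  - Example, A. N.
TI  - An Example Pharmacopoeia
ET  - 2nd
PB  - Example Press
CY  - London
PY  - 1932
ER  - 
"""

entry = to_bibtex("example1932", parse_ris(SAMPLE_RIS))
print(entry)
```

Even with something like this, the output would still need checking against the catalogue record by eye, which is why for a single book the hand built entry won.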

And why BibTeX?

Two reasons: (1) it's a common, well documented format and (2) as well as being machine readable, it's also human readable - more or less - which makes it easy for any future researcher or archivist using the data I've created to be sure that it was this edition and not that edition ...

Wednesday, 8 May 2019

recovering data from garages

Earlier today a former colleague retweeted this:


and strangely, I've been here before.

When I was managing the ANU's various ANDS funded data capture projects we made use of a company in Perth that specialised in reading old tapes - in particular for the mining industry, but they would read anything - for a fee of course.

As part of the DC7A project, we used this company to read seismological data that was locked away on piles of DAT tapes that no one could read any more, due to no one on campus having suitable hardware.

As is the way in universities, a researcher in social sciences, who worked in PNG, heard of this.

He'd recently found some old 9 track tapes in a colleague's garage, and he recognised them as likely to hold a copy of some data from the PNG government. More importantly he thought that it might be data that the PNG government had lost as a result of a hardware failure.

Details were sketchy: there were some paper labels that identified them as 9 track ASCII tapes, but that was about it.

Anyway, I talked to our data recovery company and they were happy to give it a go.

Fortunately, despite languishing in a Canberra garage, the tapes were readable and were in a straightforward comma delimited format, rather than some old proprietary compressed data format, or some strange format used by some now forgotten data manipulation software.

So the data was recoverable and could be returned to the PNG government.

Now I'm not blogging about this seven years after the event to show how good I am, but rather to show that old data can (with a bit of luck) be recovered.

But to simplify your task do the following:


  • Try and find if there's anyone still left who remembers the days of tapes - hopefully they might be able to help interpret the (paper) labels stuck on the tapes.
  • Talk to the people who are going to read the tapes. Chances are the tape will be in a 9 track format and be ASCII encoded, unless it came from somewhere that used IBM or Amdahl mainframes, where it might be EBCDIC.
  • Don't be put off by people mentioning dead manufacturers like Prime or Data General, 9 track was a fairly standard format.
  • When you finally get to read the data, remember that even though it's been recorded in a standard way it doesn't mean that the data isn't in some proprietary format - again if you can find someone who knew about the original data they might remember the name of the software package used.
Do some detective work, and chances are you might luck out, and don't be afraid to ask questions ...
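Once you have the raw bytes off the tape, the ASCII or EBCDIC question can be given a rough first answer in code. The sketch below is an assumption laden illustration, not what our recovery company actually did: it decodes a block both ways and keeps whichever looks more like plain text, and it uses Python's `cp037` codec as a stand-in for "EBCDIC" even though real mainframe data may use a different code page. The function names and the sample data are made up.

```python
import csv
import io
import string

PLAIN = set(string.printable)  # ASCII letters, digits, punctuation, whitespace


def plain_text_score(text):
    """Fraction of characters that are ordinary printable ASCII."""
    return sum(1 for ch in text if ch in PLAIN) / (len(text) or 1)


def guess_encoding(block):
    """Guess whether a raw block of bytes is ASCII or EBCDIC text.

    cp037 is only one EBCDIC code page; a real tape may need another.
    Ties go to ASCII because it is tried first.
    """
    scores = {}
    for name in ("ascii", "cp037"):
        # Undecodable bytes become U+FFFD, which counts against the score.
        scores[name] = plain_text_score(block.decode(name, errors="replace"))
    return max(scores, key=scores.get)


def read_rows(block):
    """Decode a block with the guessed encoding and parse it as comma delimited."""
    encoding = guess_encoding(block)
    text = block.decode(encoding, errors="replace")
    return encoding, list(csv.reader(io.StringIO(text)))


# Made-up sample data, encoded both ways to mimic two kinds of tape.
sample = "id,name,value\n1,alpha,10\n2,beta,20\n"
print(read_rows(sample.encode("ascii"))[0])
print(read_rows(sample.encode("cp037"))[0])
```

A guess like this only tells you how the characters were encoded. It says nothing about whether the file itself is in some package's proprietary format, which is where the detective work comes in.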


Sunday, 28 April 2019

Espresso book machines revisited

A long time ago, over ten years ago in fact, I became quite excited about the espresso book machine.

At the time it seemed to offer small run book publishers, such as your typical small university press, the opportunity to avoid the costs of printing and holding stock, as well as the potential to do on demand reprints of out of print books.

Well, ten years on, the landscape hasn't quite changed as I imagined it. Yes, there are various printers, mostly in India, who will do a cheap reprint of an out of print nineteenth century book by printing a copy of a scanned edition downloaded from the Internet Archive. For that you basically need a laptop, an internet connection, a laser printer, and access to the equipment required to bind a book. In a low cost country such as India, where labour is cheap and there is a well established book printing industry, it's probably cost effective to have a semi manual process.

But recently I've bought a couple of scholarly short run Australian books. Even though they were ordered through Amazon Australia's marketplace, due to the mysteries of the book trade, they came from online booksellers in the UK, and they had the look and feel of a print on demand book.

Strangely the front matter that contains the copyright statement and the NLA cataloguing in publication data, didn't list a printer, but at the back of the book there was a QR code and the text Lightning Source Milton Keynes, followed by what was obviously a reference number of some kind.

Being curious I turned to Google to discover that Lightning Source have a pretty informative Wikipedia page.

Basically Lightning Source is an offshoot of the same company that developed the espresso machine and provides a print on demand service to small publishing houses - just as I thought would happen all these years ago - and what's more the espresso book machine is most decidedly not dead ...

Sunday, 14 April 2019

Not another bloody thinkpad ...

I've recently blogged about how I finally got around to getting myself a new larger screen laptop to replace my old Dell Inspiron, and of course I bought myself an old Thinkpad around about a year ago, which did a stellar job of replacing my official HP Probook when I dropped coffee on it.

Well I've been so impressed by both of my Lenovo machines I've gone and bought myself a Lenovo Thinkpad Yoga 11E, one of the old touch screen models you can use as a bulky tablet.

Windows 10, 128GB SSD and 4GB RAM, and a reasonably specified processor - all for around $200. I even get 3 months warranty from the refurbisher.

So a bargain, and quite a rational purchase.

I'll explain why:

To get the most out of my old Thinkpad I really should upgrade it to Windows 10, and guess what, the upgrade cost is near enough what I just paid for the Yoga. Now if that was the only consideration I'd probably just have bought the upgrade, but I've two other pressure points:


  • My Chromebook has gone end of life - no more updates, and gradually things will cease to work. At what point it becomes unusable is unknown but what's clear is that the replacement cost will be around $400. One of my major uses of my Chromebook is reading my email and rss feeds in bed - the Yoga with its touch screen etc is a more than decent replacement.
  • My MacBook Air (a 2012 machine) is probably going to drop off the OS X supported device list sometime soon. On top of that it could probably do with a new battery - it used to manage a couple of hours between charges, it's now managing barely an hour. A new third party battery replacement kit is around $150 if you fit it yourself, or a bit over $200 if you have a repair shop do it for you. The Yoga is heavier than the Air and a little bit bulkier, but could feasibly make a decent travel computer, and being roughly the same form factor as the Air will fit in both the travel backpacks I own.
So, at the moment, I seem to own a stupid number of computers. However, the old 2008 vintage iMac I use when working with old documents is showing its age, it's already unsupported as regards MacOS and I expect that Google will soon stop supporting Chrome for that version of the operating system, and it will eventually fade away. 

The Air will obviously last a little longer, but one can see the writing on the wall, as one can with the Chromebook. I expect to keep on using my old unupgraded Thinkpad X230 for another couple of years at least.

The Yoga, being ruggedised for educational use, should last as long, and survive trains planes and car trips reasonably well. It also has a decent thinkpad style keyboard to type on (as good as the X230's) which adds to its attractiveness, so I reckon at $200 it's a bargain, and while $200 is a reasonable amount of money, it's not much more than a night in a decent city centre hotel ...

Thursday, 4 April 2019

Coffee 0 HP Probook 1

As I'm sure you're all aware, about six weeks ago I was stupid enough to pour coffee over my work laptop.

Well, it went off to the repair shop, and obviously my prompt if panicked reaction saved the day.

It was stripped down and cleaned up in an isopropyl alcohol bath. The processor daughter card was damaged, but that was replaced with a refurbished spare - tracking one down was the reason it took six weeks to repair my laptop - and it's back, almost as good as new.

All the data has survived, not that it wasn't backed up. The only problems are that it seemed to have lost its network configuration data - hardly a problem really - and the SSID was tied to the processor, so naturally Excel whinges that it hasn't been properly activated, again something that just requires the contacting corporate IT dance.

Resyncing the data back wasn't a problem either, all I needed to do was download the data from OneDrive to cover the missing days and open OneNote, and tell it to do a sync. Fifteen minutes work at most.

Obviously before I say it's really fixed I need to use it for a few days, rather than a quick click around but everything looks great.

Oh, and if you're worried that you might be at risk of spilling something on your laptop, check out this sensible advice from the NYT...

[The original title of this post was 'Coffee 0 HP Powerbook 1' - complete brain snap on my part, the laptop in question is a ProBook - a 6470b to be exact]

Sunday, 31 March 2019

Power outages and documentation

As a rider to my use of coffee to prove a documentation methodology, we had another proof of the scheme's robustness a couple of days ago.

Under the scheme, data is saved twice, once to the computer's local drive and secondly to a USB stick. The data on the computer's local drive is also backed up to OneDrive, and entrusted to Microsoft to look after.

The crucial point is that you don't need a functioning internet connection to carry out documentation - as long as you have access to one somewhere in the piece to back the data up everything is fine as you always have at least two copies of the data - very useful as I found in the coffee pouring incident as I was able to check and confirm that all the data had been backed up.
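In practice I do the double save by hand, but the idea of writing the same file to two places and confirming that both copies are good can be sketched in a few lines. This is purely illustrative: the function names are made up, the directory names stand in for the real local drive and USB stick, and the demonstration uses temporary directories and a dummy record.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def save_twice(source, local_dir, usb_dir):
    """Copy source into both directories and confirm each copy matches it."""
    source = Path(source)
    expected = sha256_of(source)
    copies = []
    for target_dir in (Path(local_dir), Path(usb_dir)):
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / source.name
        shutil.copy2(source, target)
        if sha256_of(target) != expected:
            raise IOError(f"copy to {target} does not match the original")
        copies.append(target)
    return copies


# Demonstration with temporary directories standing in for the real drives.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    record = tmp / "record.csv"
    record.write_text("item,description\n1,example bottle\n")  # dummy data
    copies = save_twice(record, tmp / "local", tmp / "usb")
    all_match = all(sha256_of(c) == sha256_of(record) for c in copies)
    copy_names = [c.parent.name for c in copies]

print(copy_names, all_match)
```

The checksum comparison is the automated version of what I did after the coffee incident: checking that what was on the stick really matched what should have been on the laptop.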

This time it was the power company. The power went off with an unscheduled outage, and more importantly stayed off. However, I had (conservatively) about three and a half hours of battery life left on my computer and the same on my phone - I use my phone to take pictures of the artefacts and transfer the data to my computer. Normally I recharge my phone as I go from my laptop, but obviously I didn't do that once the power went off - a severe case of robbing Peter to pay Paul.

So, with three and a half hours' worth of power I could stay working.

Which I did - the only limiting factor was that it began to cloud over in the early afternoon, and the light began to go, making it difficult to work.

Once home, I powered up my laptop, let it sync to OneDrive, and hey presto, we were done and backed up...

Saturday, 30 March 2019

Ok, finally got myself a new computer

Well,

about a month ago I finally got round to buying myself a new computer.

Lenovo had a special offer on their AMD Ryzen systems where you got a 512GB SSD for the cost of the standard 256GB unit, and the one thing I'm hungry for is storage.

So I went for it.

Of course as it was a special build to order configuration I had to be patient and wait for it but it eventually arrived yesterday.

Out of the box it just worked. I can't say I took to the slightly shouty voice enabled activation assistant, but it all just worked.

And once it was configured, all I needed to do was add the tools I use, much as I did last year with my old thinkpad.

The speed of set up, download and configuration was impressive, and while the keyboard wouldn't be my first choice (I prefer older clacky ones), it's pretty nice to type on.

The only annoyance was that to install Dropbox, I had to unlink some of my older machines, as Dropbox now limits free accounts to three clients, but then there's also sendtodropbox.com for use with older machines, and I guess I could start using Box more ...

[update 31/03/2019]

which indeed I've done. I've added the box client to my new computer and to my ipad (on which I'd never got round to installing dropbox) - and we'll see how this goes ...

The use case is of course slightly different - when dropbox, box, and the rest first came on the scene there was little in the way of cloud based storage, and sharing files between machines essentially meant copying them between machines.

Dropbox like services' unique proposition was that the files were always in sync providing you had a working connection.

Things of course are different these days. Be it OneDrive, Google Drive or Amazon's services, there are lots of ways to both share files between machines and ensure that they are stored securely. For example, if I'm working in a library somewhere with my iPad, I can easily save the notes I've written by sending them to OneDrive from Pages, or indeed saving them to iCloud.

What Dropbox (and the rest) now have as their unique proposition is 'save once, sync everywhere', without people having to go looking for the latest version.

Given the chaos I've seen with shared editing of funding proposals, that's a pretty powerful proposition for a group, but for an individual, especially as the first tier up costs the same as any other storage solution - say A$15 a month for a terabyte - perhaps less so.

As I said, we'll see how this goes ...