Monday, 29 July 2019

capturing a tweet thread ...

Every day I run a set of automated google alert searches on topics that I'm interested in - Greek and Roman archaeology, medieval history, Egyptology and a couple of others.

A few years ago these would regularly pick up something interesting on someone's research blog, and I would start following their RSS feed, and I'd also quite often clip and store interesting material into one of my pack rat notebooks on Evernote or OneNote.

Well, it's 2019, and people don't write blogs anything like as much. Interesting things are still happening out there, but quite often what's cool is published as a twitter thread, and not as a blog post.

For example I've just read a fascinating thread on bringing Ancient Egyptian yeast back to life, which pushed my buttons in so many ways.

But it's a thread and that's a problem - how do you capture the thread to save to an online notebook, or indeed print offline to read on the train?

Well, I've found and used two solutions:

Spooler - https://tinysubversions.com/spooler/

and

Threadreaderapp - https://threadreaderapp.com/

Both work more or less the same, and both produce threads that can be saved to OneNote with the OneNote web clipper, or printed to a pdf.

One little gotcha is that if you have an image heavy thread you need to check that all the images are loaded before either saving to OneNote/Evernote or printing to pdf, otherwise you end up with a pile of blank rectangles where the pictures should be. OneNote's preview function is useful here for checking that your clip contains what you really want.

The major difference between the two is that threadreader doesn't force you to login with your twitter account to use the application while spooler does.

Spooler wants the url of the last tweet in the thread, Threadreader wants the first - all in all Threadreader feels a little better supported and a little more sophisticated, but that's about it - it does offer some options to save and download your threads if you login, but you don't need to.

Both do the job so it's really a coin toss as to which to use ...




Sunday, 7 July 2019

I nearly bought a windows phone ...

which seems to be a very silly thing to do, given that they've gone end of life.

But I thought I had a reason - overseas travel.

For the last four or so years we've used an old Nokia Asha 302, and while it's done excellently as a travel phone, long battery life, good for texting hotels and taxis, it's clearly reached the end of its life.

Increasingly one needs to have something that runs apps for Uber, Grab, some local service you've not heard of yet etc etc.

And that's the rub.

With the windows phone going end of life, you can guarantee that increasingly there won't be a windows phone version of that crucial travel app.

Which is a shame, because (a) you can get a pretty well specified phone for under a hundred bucks, and (b) you don't need to tie it to your Google or Apple account.

But as I said, the need for access to a mainstream software platform kills that dead, so I guess it's a cheap no name android phone and a dummy google account ...

Thursday, 4 July 2019

Digitising magnetic tapes - in house or outsource

Earlier today I posted the following on twitter as part of a conversation as to whether it was better to outsource the digitisation of several hundred cassette tapes:


The answer is more complicated than twitter allows, so I thought I'd expand it.

Cassette tapes were phenomenally popular during the roughly thirty year life of the tape cassette as a mainstream format. Not only were they used for student party tapes, they were also extensively used to record court transactions, music, including performances by non mainstream performers, and spoken language. So not surprisingly they form a huge resource for linguists, anthropologists and the rest.

Not only were cassette recorders cheap, the media was also cheap and universally available, be it in rural Turkey or Morocco or in high street discount stores.

The tapes did fail and jam in players, which is why no roadside was complete without a sprinkling of dead cassettes and flickering strands of cassette tape. That this is no longer the case is because they're not used anymore - most informal and non professional recordings are on USB sticks these days.

When we visited Sri Lanka six years ago all the drivers we had were already using USB sticks to play pirated Indian and Korean pop music.

This leads to a problem - no one much makes cassette decks anymore, equally no one makes cassettes in volume, and more importantly no one makes those handy little kits you used to get to unjam, rewind and generally repair broken cassettes.

Searching on ebay for 'blank cassette tapes' does bring up a range of choices, but they're expensive, and certainly not the cheap universal medium they once were. Likewise, while it's still possible to buy cassette players, the more expensive professional equipment can be difficult to track down.

So, the first question is: do you have the kit to record the data?

Tape cassettes are of course analogue, but you can copy a cassette's content to digital media by connecting a cassette player's output socket to the microphone input socket on a pc and using some suitable software to capture the input and perform the analogue to digital conversion. You can buy devices that claim to do the conversion for you, but I've no experience of how well, or badly, they perform.

However, doing a simple direct conversion is probably fine for a few tapes. At a little over an hour for a C60 or an hour and a half for a C90 tape, it will be tedious, but possible. At least you'll have plenty of time to transcribe the label and any other information that comes with the tape.
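To get a feel for how the tedium scales, here is a rough back of the envelope sketch. The tape counts are made up for illustration, and the capture settings (44.1 kHz, 16 bit, stereo, uncompressed) are an assumption rather than a recommendation.

```python
# Rough estimate of real-time capture effort and storage for a batch of tapes.
# All counts below are illustrative assumptions, not figures from any project.

SAMPLE_RATE_HZ = 44_100   # CD-quality sampling rate (assumed capture setting)
BYTES_PER_SAMPLE = 2      # 16-bit samples
CHANNELS = 2              # stereo

TAPE_MINUTES = {"C60": 60, "C90": 90}  # nominal playing time, both sides


def capture_hours(tape_counts):
    """Total playing time in hours, since capture runs at normal speed."""
    minutes = sum(TAPE_MINUTES[kind] * count for kind, count in tape_counts.items())
    return minutes / 60


def wav_gigabytes(tape_counts):
    """Uncompressed PCM size in GB (10**9 bytes) for the whole batch."""
    seconds = sum(TAPE_MINUTES[kind] * 60 * count for kind, count in tape_counts.items())
    return seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS / 1e9


batch = {"C60": 200, "C90": 100}  # hypothetical collection of 300 tapes
print(f"{capture_hours(batch):.0f} hours of playback")
print(f"{wav_gigabytes(batch):.0f} GB of uncompressed audio")
```

For this hypothetical batch that is 350 hours of playing time before any allowance for loading tapes, checking levels or fixing jams, which is the sort of number that makes the in house or outsource question worth asking.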

The problem of course is that your tape player will most probably be at least ten years old, and the tapes will be equally old, so you need to have a plan B, or at least a spare tape player in case of equipment failure - remember the more tapes you have the more likely your old tape player will fail.

Equally, the more tapes you have the more likely tape failure becomes, and you need to have a plan to repair cassettes which break and jam, and you need people with the skills to repair damaged cassettes.
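The "more tapes, more failures" point can be made concrete with a little arithmetic. This sketch assumes every tape fails independently with the same probability, and the 1% figure is invented purely for illustration - it is not a measured failure rate for cassettes or players.

```python
# Chance of at least one failure across a batch, assuming each tape fails
# independently with the same probability. The 1% figure is a made-up
# illustration, not a measured failure rate.

def chance_of_any_failure(per_tape_failure, tapes):
    """P(at least one failure) = 1 - (1 - p) ** n."""
    return 1 - (1 - per_tape_failure) ** tapes


for tapes in (10, 100, 300):
    print(tapes, f"{chance_of_any_failure(0.01, tapes):.0%}")
```

Under those assumptions a handful of tapes will probably go through cleanly, while with a few hundred at least one breakage is close to certain, so the repair plan stops being optional.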

There used to be such things as high speed tape duplication machines, which basically ran the cassette through eight or sixteen times as fast. While you could conceivably use one of these to speed up the digitisation process, remember that old tapes are more likely to fail and break when stressed by being played at high speed.

And this of course means that you really do need to have access to someone who works with the media and can repair both the devices and the tapes.

One place I worked, we had a project to recover and preserve culturally significant tape recordings and we had a couple of people whose job was basically to scour ebay for spares, maintain old tape recorders, and if necessary repair old broken decayed tapes.

That expertise is hard to find - you basically need to find and employ some old school sound engineers who have worked with a range of equipment and still have all their old skills.

That project was over ten years ago now, and it's important to remember that as time goes on these skills are harder and harder to find, as increasingly all the old school sound engineers and tape technicians are out of the workforce enjoying a well earned retirement.

So, it can be done in house, and if you are already set up to digitise analogue tapes it is a fairly straightforward, if tedious, exercise. Likewise if it's only a few tapes, and they're not critically important you could probably track down a decent quality cassette deck in working order and do it yourself - it's simply a decision as to whether outsourcing is cheaper than doing it in house.

If you've a lot, and the contents are valuable, I'd certainly seriously consider employing a specialist external company to do the work ...


Thursday, 20 June 2019

University news pages

As any fule kno I probably spend more time than I ought to retweeting links to interesting stories - principally though not exclusively ones based on classical and early medieval history.

I actually started doing this years ago purely for my own benefit - in the days before pocket - as a way of saving the URLs of articles I wanted to read later. Oddly, some people found what I was tweeting interesting, and started following me, so even though pocket is now a feature of the information landscape I've kept on tweeting.

But sometimes I find an article that is sketchy and unsatisfactory in some way and I try and track down a better version, again really for my own benefit, but if someone finds it useful, well why not?

If the article refers to a specific researcher at a university I usually try searching that university's news pages as that is where I kind of expect the original press release to be.

Except sometimes it's not.

Sometimes a university's news site is more about how well the rugby team did, or what the vice chancellor had for lunch, than the actual outcomes of research, and even more worrying, sometimes all the news is hidden behind scads of marketing information aimed at attracting students (and their fee money of course) to a particular university.

And while research ratings are important, they're only one part of the university ranking game, and some university marketing/press departments seem to be more interested in marketing than communicating.

I promise not to rant on about the actual irrelevance of university rankings to student outcomes, but given that much university research, especially in the humanities, is funded with public money, I would have thought that communicating the results of publicly funded research was an important part of the function of university press offices, rather than leaving the public, who paid for the work, to play a game of guess the url to find the university's research news site ...

(... and of course this has to be done manually - a surprising number of institutions no longer provide an RSS feed)

Wednesday, 22 May 2019

The joy of bibtex ...

The project's been chugging along nicely, and I've nearly finished documenting the dispensary and the back shop - we originally thought that there would be around 4000 items in total, but I've already documented around three and a half thousand, and there's still the shop to do.

Recently, one of the groups of items documented was a set of reference books - pharmacopoeias mainly, the earliest from 1914, the latest from 1963.

Too early to have ISBNs, and some were different editions of the same pharmacopoeia.

So, how to document them and provide a unique reference, and preferably one that was machine readable?

BibTeX!

All the books, and the correct editions, were on the National Library of Australia's catalogue, which provides a handy download of the BibTeX reference. This gives us a professionally compiled description of the item, plus a reference to the NLA's catalogue to allow someone in the future to do a simple double check.

The one exception was a book which I couldn't find in the NLA, or any of the state libraries in Australia, but did find in the British Library, which unfortunately doesn't provide a handy citation export in BibTeX format.

I could, I suppose, have downloaded the citation in the BL's preferred format and run it through one of the EndNote to BibTeX, or MARC to BibTeX, conversion tools, but as it was only one entry, downloading, installing, and then checking the output seemed almost as much work as creating an entry by hand, so I ended up hand creating an entry based on the BL's RIS output.

And why BibTeX?

Two reasons: (1) it's a common well documented format and (2) as well as being machine readable, it's also human readable - more or less - which makes it easy for any future researcher or archivist using the data I've created to be sure that it was this edition and not that edition ...
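For anyone facing more than one such entry, the RIS to BibTeX step is simple enough to script. What follows is a minimal sketch, not the process I actually used: the function names are my own invention, the record is entirely made up (it is not the British Library book), and only a handful of common RIS tags are handled.

```python
# Minimal sketch: turn one RIS record into a BibTeX @book entry.
# The sample record is invented for illustration; it is not a real
# catalogue entry. Only a few common RIS tags are mapped.

RIS_TO_BIBTEX = {
    "TI": "title",
    "ET": "edition",
    "PY": "year",
    "PB": "publisher",
    "CY": "address",
}


def parse_ris(text):
    """Collect RIS 'TAG  - value' lines into a dict; authors go in a list."""
    record = {"AU": []}
    for line in text.splitlines():
        tag, sep, value = line.partition("  - ")
        if not sep:
            continue  # skip blank or malformed lines
        tag, value = tag.strip(), value.strip()
        if tag == "AU":
            record["AU"].append(value)
        elif tag != "ER":  # ER only marks the end of the record
            record[tag] = value
    return record


def to_bibtex(record, key):
    """Render the record as a BibTeX @book entry under the given cite key."""
    fields = []
    if record["AU"]:
        fields.append(("author", " and ".join(record["AU"])))
    for ris_tag, bib_field in RIS_TO_BIBTEX.items():
        if ris_tag in record:
            fields.append((bib_field, record[ris_tag]))
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@book{{{key},\n{body}\n}}"


sample = """TY  - BOOK
AU  - Example, A. N.
TI  - A Hypothetical Pharmacopoeia
ET  - 3rd
PY  - 1932
PB  - Example Press
CY  - London
ER  - 
"""
print(to_bibtex(parse_ris(sample), "example1932"))
```

The output still needs checking by eye against the catalogue record, which is rather the point of choosing a format that is human readable as well as machine readable.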

Wednesday, 8 May 2019

recovering data from garages

Earlier today a former colleague retweeted this:


and strangely, I've been here before.

When I was managing the ANU's various ANDS funded data capture projects we made use of a company in Perth that specialised in reading old tapes - in particular for the mining industry, but they would read anything - for a fee of course.

As part of the DC7A project, we used this company to read seismological data that was locked away on piles of DAT tapes that no one could read any more, due to no one on campus having suitable hardware.

As is the way in universities, a researcher in social sciences who worked in PNG heard of this.

He'd recently found some old 9 track tapes in a colleague's garage, and he recognised them as likely to hold a copy of some data from the PNG government. More importantly he thought that it might be data that the PNG government had lost as a result of a hardware failure.

Details were sketchy. There were some paper labels that identified them as 9 track ASCII tapes, but that was about it.

Anyway, I talked to our data recovery company and they were happy to give it a go.

Fortunately, despite languishing in a Canberra garage, the tapes were readable and were in a straightforward comma delimited format, rather than some old proprietary compressed data format, or some strange format used by some now forgotten data manipulation software.

So the data was recoverable and could be returned to the PNG government.

Now I'm not blogging about this seven years after the event to show how good I am, but rather to show that old data can (with a bit of luck) be recovered.

But to simplify your task do the following:


  • Try and find out if there's anyone still left who remembers the days of tapes - hopefully they might be able to help interpret the (paper) labels stuck on the tapes.
  • Talk to the people who are going to read the tapes. Chances are the tape will be in a 9 track format and be ASCII encoded, unless it came from somewhere that used IBM or Amdahl mainframes, where it might be EBCDIC.
  • Don't be put off by people mentioning dead manufacturers like Prime or Data General - 9 track was a fairly standard format.
  • When you finally get to read the data, remember that even though it's been recorded in a standard way it doesn't mean that the data isn't in some proprietary format - again if you can find someone who knew about the original data they might remember the name of the software package used.
Do some detective work, and chances are you might luck out, and don't be afraid to ask questions ...
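Once the tape contents are back as a file of bytes, the ASCII or EBCDIC question can be tested quite cheaply. This is a minimal sketch under some loud assumptions: the function names are mine, the sample data is made up, the EBCDIC variant is taken to be code page 037 (real tapes may use another code page), and the records are newline separated rather than fixed length blocks.

```python
import csv
import io

# Minimal sketch: decode recovered bytes as ASCII if possible, otherwise
# assume EBCDIC code page 037, then read the result as comma delimited text.
# cp037 is an assumption; other EBCDIC code pages exist and differ slightly.


def decode_tape_block(raw):
    """Decode as ASCII if possible, otherwise fall back to EBCDIC (cp037)."""
    try:
        return raw.decode("ascii")
    except UnicodeDecodeError:
        # EBCDIC letters and digits all sit above 0x7F, so ASCII decoding fails
        return raw.decode("cp037")


def read_rows(raw):
    """Return the decoded data as a list of rows of string fields."""
    return list(csv.reader(io.StringIO(decode_tape_block(raw))))


# Made-up sample data, encoded both ways to show the two paths
sample = "DISTRICT,YEAR,COUNT\nEXAMPLE,1980,42\n"
for raw in (sample.encode("ascii"), sample.encode("cp037")):
    print(read_rows(raw))
```

If the decoded text still looks like nonsense in both encodings, that is a strong hint you are looking at a proprietary or compressed format, and it's time to go back to the people who remember the original software.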


Sunday, 28 April 2019

Espresso book machines revisited

A long time ago, over ten years ago in fact, I became quite excited about the espresso book machine.

At the time it seemed to offer small run book publishers, such as your typical small university press, the opportunity to avoid the costs of printing and holding stock, as well as the potential to do on demand reprints of out of print books.

Well, ten years on, the landscape hasn't quite changed as I imagined it. Yes, there are various printers, mostly in India, who will do a cheap reprint of an out of print nineteenth century book by printing a copy of a scanned edition downloaded from the internet archive. For that you basically need a laptop, an internet connection, a laser printer, and access to the equipment required to bind a book, and in a low cost country such as India, where labour is cheap and there is a well established book printing industry, it's probably cost effective to have a semi manual process.

But recently I've bought a couple of scholarly short run Australian books. Even though they were ordered through Amazon Australia's marketplace, due to the mysteries of the book trade, they came from online booksellers in the UK, and they had the look and feel of a print on demand book.

Strangely the front matter that contains the copyright statement and the NLA cataloguing in publication data, didn't list a printer, but at the back of the book there was a QR code and the text Lightning Source Milton Keynes, followed by what was obviously a reference number of some kind.

Being curious I turned to Google to discover that Lightning Source have a pretty informative wikipedia page.

Basically Lightning Source is an offshoot of the same company that developed the espresso machine and provides a print on demand service to small publishing houses - just as I thought would happen all those years ago - and what's more the espresso book machine is most decidedly not dead ...