Tuesday, 29 April 2014

The XP apocalypse

When Microsoft discontinued support for XP there were all sorts of dire predictions that the sky would fall.

Most people running XP ignored these - for a lot of reasons, probably quite a few centred around inertia and complacency - after all they had some antivirus software, everything was working well and met their needs, and, well, upgrading costs money, not just for the OS but to replace those legacy applications that would no longer work with 7 or 8.

If this was true of small business users, who had XP in their cash tills and god knows where else, it was doubly true of a large number of home users, including that often ignored constituency of older users who still have a single PC at home.

After all the kids have left home, the pc lets them skype the kids, do their internet banking, online shopping, home accounts and send the odd email - everyone has people like this in their family - they’re not luddites, they just don’t see the need beyond the half dozen things that they do.

Well now we’ve got a problem with Internet Explorer. The first thing to do with these people is give them Firefox or Chrome - both well supported and both browsers that banks and the major online shopping sites trust.

The second thing to do is explain to them that this is something that will keep happening - XP will not die with a bang but a whimper.

The third thing to do is to give them a live CD of a long term support (LTS) version of Ubuntu. Change the boot priorities on their PC so that if they have the live CD in the drive they’ll boot from that by preference (these are the sort of people who shut their PCs down after using them, so you’ll probably need to explain about putting the CD in the drive and restarting).

Once you’ve done that - show them Ubuntu - show them LibreOffice, show them Firefox, and because they probably still use a mail client rather than web mail show them Evolution. As they probably have an older PC you can pretty much guarantee that they won’t have problems with hardware or firmware compatibility.

Then leave them the CD and suggest they explore some more - some will, some won’t, but it’s important to give them the option. The other thing is that they will talk to their friends about this, and whether they’d be better off with a new PC.

Windows 8 has been a disaster in marketing terms, and wildly unpopular with this sort of user because it’s just so different in terms of user experience. I have a cynical suspicion that the introduction of Windows 8 pushed up sales of Macs to the 50+ demographic - after all those nice clean kids in the Apple store show you just how easy it is to change over and move your files.

It’s also important to appreciate that this demographic doesn’t know what things cost - computers have always been expensive to them, so $1500 for a new iMac seems to be just what it costs, no matter that you could get a decent Asus all in one with Windows 8 for not much more than half that from one of those discount stores with bright lights and annoying just post pubescent sales people (and this demographic likes all in ones - for years they’ve fought with where to put the system box in your classic three box combo).

So suppose they decide to try Ubuntu. Basically you need to come back and help them install it. It may be easy but it’s not something they’ve ever done before and they’re worried about stuffing it up.

First of all back up their files. Burn them to CD or DVD. Then install Ubuntu - the aim here of course is to get rid of XP so you are going to wipe the disk and start over, none of your dual boot stuff here.

Then set up their email and give them back their files. Show them that their email works, that they can read their documents in LibreOffice, and view their photos. Give them the backup DVD and tell them to keep it safe.

Now this is fairly tedious but think of it as payback for the times they helped you change the oil in your car when you were a penniless student. Do it right and you probably won’t get too many phone calls asking for help.

Also, don’t be surprised if a couple of months later you find they’ve bought a new computer. Changing things has made them think about upgrading. Normally they only replace a computer when it breaks. And that backup DVD will come in really useful for moving their files to their new computer.

Either way, they’re off XP onto something supported, which probably means they’ll avoid a real disaster …

Written with StackEdit.

Tuesday, 22 April 2014

Handwriting recognition ...

Google engineers working on extracting house numbers from street view images have recently got their algorithms to the point where they can beat Captchas.

House numbers are of course a good case as there’s no real rules in most places about font, orientation and so on. And of course character shapes can be somewhat distorted in the images.

At the same time engineers at Evernote have blogged about indexing handwritten images - a seemingly dry topic, but of course if you have a bunch of scanned handwritten notes you need to do some text recognition and extraction to identify meaningful text before you can index it.

Putting the two posts together comes up with some intriguing possibilities. There’s some agreement that the best way to recognise handwriting is to decompose it to the individual letters and do shape recognition - even if, like me, you have handwriting distorted by years of note scribbling, the letter shapes you use are fairly consistent - and being able to cope with distortion better will find the shapes more easily. At the same time Evernote’s approach is interesting as it effectively does look ups of sequences of letters to guess word boundaries to work out which letters belong together.
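
As a rough illustration of the look up idea, here is a minimal Python sketch that splits a run of already recognised letters into words by checking candidate sequences against a word list. It is not Evernote’s actual method, which is only described in outline above; the function name segment, the max_word_len parameter and the toy vocabulary are all made up for the example.

```python
def segment(letters, vocabulary, max_word_len=12):
    """Split a run of recognised letters into dictionary words.

    Returns a list of words, or None if no full segmentation exists.
    """
    n = len(letters)
    # best[i] holds one segmentation of letters[:i], or None if none was found
    best = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            candidate = letters[start:end]
            if best[start] is not None and candidate in vocabulary:
                best[end] = best[start] + [candidate]
                break
    return best[n]


# Hypothetical toy vocabulary, purely for illustration
VOCABULARY = {"tax", "roll", "rolls", "of", "the", "manor"}

print(segment("taxrollsofthemanor", VOCABULARY))
# prints ['tax', 'rolls', 'of', 'the', 'manor']
```

A real system would presumably have to weigh several candidate letters per shape and score competing splits rather than take the first one that fits, but the principle of using known letter sequences to find word boundaries is the same.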

Putting the two together not only do we have the potential of better handwriting OCR, but also the possibility of doing text extraction from hard-to-read handwritten documents - and here I’m thinking about documents such as medieval tax rolls which have not usually been digitised or transcribed.

Given that both Evernote and Google contribute to a range of open source projects there’s more than a chance that we’ll see the technologies becoming available reasonably soon ….


Ok, a replacement chromebook

As I’ve already written I finally got a refund last week for my faulty chromebook, and then went out and replaced it.

Obviously not with another of the faulty model, but with the 11” HP model, for two reasons:

  1. it fitted the bag I’d bought for my previous device
  2. it was on longterm support from google - not the longest, but not bad

So how is it?

Well having inadvertently found myself in the position of having two devices in rapid succession I can not only compare it to the classic netbook experience, but to earlier chromebook designs, which were basically netbooks with CrOS.

The HP model uses a RISC processor which means that it runs cooler - meaning (a) no cooling vents on the base of the unit and (b) you can comfortably sit with it on your lap. The RISC processor also means better battery life, important if you are using it for note taking etc.

The device is pretty thin - no bulky ethernet port or cooling fan internally makes for a thin case. Ports are pretty meager, headphones and a couple of USB ports, but again probably all you need.

The keyboard is a standard chiclet type. Compared to my MSI netbook the keys are a little more squashed up against each other but you can type reasonably quickly. In true Chromebook style there are no PgUp or PgDn keys.

I found the trackpad just a little too assertive and responsive out of the box - I think this is an HP thing - I felt the same about J’s HP Windows laptop, but as in all things familiarity and use helps.

For those of us who care about such matters, connecting to eduroam was straightforward (see this guide from Sheffield Uni in the UK for example instructions).

All in all I’d describe the HP Chromebook as a tablet with a keyboard, rather than a netbook with ChromeOS.

The next (real) test will be to take it travelling and see how it compares on the road with a windows netbook …


Thursday, 17 April 2014

New Chromebook

Well I finally got a refund for my faulty Chromebook. It was a struggle but I won in the end.

To celebrate I went out and got myself a replacement Chromebook, one manufactured by HP this time.

I was incredibly impressed again by the whole statelessness thing - I literally plugged it in, waited twenty or so minutes to make sure it had a decent amount of charge, turned it on, configured the network - and then hey presto! it was all there, even down to the wallpaper I’d set on the desktop on my last (faulty) unit so I’d know if they shipped it back to me without fiddling with it.

Thinking about things and comparing things with my laptop it’s the statelessness factor that sells the whole Chromebook model.

Strangely, the same thing goes for Roll.app - your files are saved elsewhere on Dropbox or Google Drive, and are hence accessible from anywhere, from whichever device, and without having to have any particular software installed other than a reasonably recent browser.

If you work in various places as I do, and swap between machines and platforms, it’s something that is incredibly powerful …


Tuesday, 15 April 2014

Data Librarians or eresearch analysts

There’s a quasi meme going around at the moment that, as libraries increasingly become portals for online resources and scholarly publication (and data publication) moves online, librarians will morph into data librarians.

You can see the logic in this one. Librarians know about citation, they know about discovery, and they know about access to electronic resources, so it would be very stupid to say that they don’t have a role to play.

What they tend not to know about, of course, is sordid stuff like storage architectures, migration strategies, file formats, backup and the rest. That’s the province of information technology, and equally it would be stupid to say that IT people don’t have a role, and indeed, with the harvesting and handling of data, a greater role.

So why does the meme focus on librarians and not IT geeks?

Well there could be a variety of reasons.

Twenty or thirty years ago IT people were generalists - there was little formal training other than some vendor specific courses, and most people taught themselves. There was also a plethora of competing solutions, so a lot of time was spent giving advice and helping researchers choose suitable hardware and software. (Yes, there were always some men (and they were usually men) who wore suits and talked about COBOL and made sure everyone got paid on the right day, but we’re not talking about them, even though they are still with us, except these days it’s Cognos.)

If you had looked at any university computer centre in the nineties you would have found people in a support/advisory/enabling role. Some may have been old fashioned applications programmers who learned new tricks, and some may have been people like me who’d started out doing something else but who’d ended up in support because they were good at it.

Nowadays these people are a rarity. The older ones have retired, and there’s no clear replacement cohort.

Why? Because in the nineties IT changed and became, in the main, a Microsoft based monoculture, highly technical and looking for skills in complex products such as Active Directory, Exchange, Sharepoint and the rest, with the result that the culture changed - you can see this in any mixed Windows and Unix shop - the Unix staff tend to be older, sometimes scruffier, more diverse, and have odd hobbies. The Windows staff tend to be younger, more focused, and just a bit more corporate in their manner.

This isn’t universal, but I’m sure that if you know your data centres you’ll recognise the stereotypes.

The consequence of the change was a hollowing out of IT to concentrate on service delivery rather than support - something which brought efficiencies and may have made achieving KPIs easier, but along the way engagement was lost between IT staff and researchers - and this engagement has been lost for long enough now that people have forgotten that it existed.

So, as a consequence, the focus is on librarians becoming data specialists, in the main because they are perceived as being more engaged with the research faculty. However this does neglect the need for quite a lot of formal IT technical knowledge to provide informed advice and discuss options - perhaps what we need is a second meme on building technical engagement between IT staff and researchers …


Retext on Roll.app

Back in January I played with LibreOffice on Roll.app, which of course allows the editing of native odt documents on a Chromebook on those occasions when the Google Docs import does not quite work.

You can also use the Linux markdown editor Retext on Roll.app - here the use case is less clear given the availability of StackEdit in the Chrome environment, but it does provide an easy means of generating a pdf to share or indeed email off to some other service such as Evernote.

Running Apache Tika over a pdf generated from the online version of Retext shows it to be an incredibly standard implementation:

Content-Length: 17202
Content-Type: application/pdf
Creation-Date: 2014-04-14T15:59:03Z
created: Tue Apr 15 01:59:03 EST 2014
date: 2014-04-14T15:59:03Z
dc:title: New document
dcterms:created: 2014-04-14T15:59:03Z
meta:creation-date: 2014-04-14T15:59:03Z
producer: Qt 4.8.1 (C) 2011 Nokia Corporation and/or its subsidiary(-ies)
resourceName: richard_carew.pdf
title: New document
xmp:CreatorTool: ReText 3.1.3
xmpTPg:NPages: 1

(doing the same thing with the original markdown file is utterly uninformative due to markdown's lack of embedded metadata:

Content-Encoding: ISO-8859-1
Content-Length: 197
Content-Type: text/plain; charset=ISO-8859-1

resourceName: richard_carew.mkd

which is of course what makes markdown so portable)
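
If you wanted to compare this kind of output across a batch of files, a small helper could turn the "key: value" lines into a dictionary. What follows is only a minimal Python sketch that works on text pasted in from the listing above; it does not call Tika itself, and the names parse_metadata and SAMPLE are made up for the example.

```python
def parse_metadata(text):
    """Turn 'key: value' lines into a dictionary, skipping blank lines."""
    metadata = {}
    for line in text.splitlines():
        # Split on the first ': ' so keys such as 'dc:title' stay intact
        key, separator, value = line.partition(": ")
        if separator:
            metadata[key.strip()] = value.strip()
    return metadata


# A few lines copied from the pdf listing above
SAMPLE = """\
Content-Type: application/pdf
dc:title: New document
dcterms:created: 2014-04-14T15:59:03Z
xmp:CreatorTool: ReText 3.1.3
xmpTPg:NPages: 1
"""

fields = parse_metadata(SAMPLE)
print(fields["xmp:CreatorTool"])   # prints ReText 3.1.3
print(fields["dcterms:created"])   # prints 2014-04-14T15:59:03Z
```

Splitting on the first colon followed by a space is a deliberate choice, since both the namespaced keys and the timestamps contain colons of their own.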

Application performance could have been faster, but given that the servers are topologically a long way from where I am it was quite acceptable for typing - something to bear in mind if using the service for field work to type up a bunch of notes.

All in all I find the Roll.app business model quite interesting - provide an execution environment, show people Macca's ads before starting the app, and have people connect their Dropbox and Google Drive accounts - something that saves them the trouble of providing storage, and yet, unlike some other freemium services, lets them garner some revenue from casual users ...

Monday, 14 April 2014

Nixnote 1.5

I'm an extensive Evernote user, and I've played with NixNote, the open source Linux client, over the years. My use case is that I'm most definitely a Linux user as well as an Evernote user, meaning that I need to have Evernote access from Linux. Until now that has meant the web version - neither NixNote nor Everpad (another open source client) has delivered.

Of the two, Nixnote was clearly the front runner, but even so, I'd describe NixNote up to now as promising - that's changed with version 1.5 which is stable, usable, responsive.

Synchronisation is still a little slow, but overall performance is as good as the official client on both Windows and the Mac, and searching for notes is faster than using the web version.

NixNote is also available for Windows and OS X but I've never tested these versions and can't comment. However, if you're a serious Evernote user with Linux inclinations, I'd suggest giving the current version of NixNote a go ...


Friday, 4 April 2014

Big archives and data liberation ...

There is a lot of chatter around big data and how it is going to save the world.

While it is true that some of these incredibly large datasets have value outside of allowing supermarkets to know which brand of cat food you, or rather your cat, prefers, they also tend to obscure another aspect of the data revolution.

That aspect is not the very large datasets themselves but the volume and diversity of data sets that have been put online as a result of various digitisation projects.

For example, I have found online accounts and pictures of a project to help orphaned Basque refugee children from the Spanish civil war that my mother worked for, and the details and records for the sinking of the ship my father was second engineer on during the retreat from Singapore.

It is not that these things were unknown - but even 10 years ago to find the records would have involved writing to archivists, a trip or two to the archives themselves, requests for material to be copied, etc. It would have taken a lot of time, and given that I live 20,000 km from some of the archives in question, damn near impossible.

Today I can find most things from my desk, starting with Google and perhaps a few likely resources. Just as I did to find how news of Lincoln’s assassination reached Australia.

Such is the power of online that material that isn’t online effectively does not exist - which is undoubtedly not good, as it means that non digitised sources will tend to be ignored, skewing scholarship. But we also need to recognise the power of online access to resources and how it empowers people not only to ask questions but to find answers, even if the questions they have are about early medieval cats, rather than something profound.

In Nineteen Eighty-Four George Orwell came up with the memorable line
Who controls the past controls the future: who controls the present controls the past.
Scarily true when one looks at the rewriting of history to justify current actions. Digitising resources and making them open and available reduces the risk of manipulation, something which also goes for research data ...