Stuff, geeky stuff: 02/01/2014

Monday 17 February 2014

Canaima

Following on from my look at Huayra, I thought I'd take a look at Canaima, another Latin American Linux distribution that has some push behind it.

Canaima is a Venezuelan government sponsored Linux distribution. They've recently announced a technology agreement with Huayra, which is why I thought I'd take a look.

Canaima turns out to be just another Debian GNU/Linux distro, with a few localised applications for email and web access but otherwise, utterly standard.

All the options exist for an English language installation (and indeed for all the other languages supported by Debian). Under VirtualBox it complained that it lacked the correct graphics drivers to start GNOME properly, but running the Linux updater fixed that.

The application software that ships with the distro is more or less what would be expected - Libre Office, Gedit, Evince and the rest - the only real departures being a locally customised web browser based on Iceweasel, and a locally customised email client.

The value proposition for adopting Linux is of course no licensing fees for applications or operating system, which potentially simplifies managing upgrades and distributions, as well as encouraging a local software industry.

I find the no licensing cost option particularly interesting. Apple have of course recently released Mavericks as a no cost upgrade in an effort to standardise the operating system environment across their user base. And when I used to be concerned with managing desktop services, licensing, and the cost of upgrades in particular, was always a major concern.

Interestingly, according to wikipedia, the Venezuelan government has also sought to get vendors to certify their hardware for use with the operating system, which would form the first step in building an ecology around its use as well as providing some assurance to hardware purchasers that the hardware was compatible.

I, of course, have not visited either Argentina or Venezuela, so cannot comment on the actual extent of the adoption of Huayra or Canaima, but these are both interesting initiatives both educationally and technologically ...

Friday 14 February 2014

Huayra ...

There’s been a lot of chatter about the adoption of open source in Latin
America, including the adoption of linux based distributions in education.
So I decided to try one.


I chose Huayra, a distribution sponsored by the Argentine Government.

Huayra arose out of an Argentine government initiative to distribute netbooks to all secondary school students in Argentina. Huayra was designed to provide a common standard operating system across all computers to standardise the provision of training and support.

On a more personal note, I chose a Spanish language distribution over a Portuguese language based distribution from Brazil for the simple reason that my sketchy scratchy
peninsular Spanish is way better than my near non existent Portuguese.

The Spanish version of Wikipedia describes Huayra as:

Huayra GNU/Linux es un sistema operativo desarrollado por el Programa
Conectar Igualdad de Argentina. Huayra está basado en GNU/Linux, es una
metadistribución de Debian. Liberado con la Licencia GNU/GPL, Huayra es
software libre.
Su desarrollo está a cargo del Centro Nacional de Investigación y Desarrollo
de Tecnologías Libres (CENITAL) de Argentina, dependiente de la ANSES y toma
en consideración las necesidades tanto de estudiantes como de docentes y
otros actores involucrados en el programa Conectar Igualdad.
Huayra toma su nombre del vocablo Quechua que significa Viento.

Which Google translate renders as

Huayra GNU / Linux is an operating system developed by the Equal Connect
Program Argentina. Huayra is based on GNU / Linux, is a meta-distribution of
Debian. Released with the GNU / GPL license, Huayra is free software. Its
development is undertaken by the National Centre for Research and
Development of Free Technologies (ZENITH) of Argentina, under the anses and
takes into consideration the needs of both students and teachers and others
involved in the Equal Connect program. Huayra is named after the Quechua
word that means wind

It’s available as a 2.2GB live CD download. The download was not particularly
fast but that’s only to be expected given the network topology between Australia
and Argentina.

The live CD booted reasonably fast under Virtualbox and had all the usual
educational tools - nothing more and nothing less. As you’d expect everything
worked - it did exactly what it said. The colour scheme might not have been what
I would have chosen, but that was my only criticism.

However it was a little slow - so I decided to install it to VirtualBox to
make a proper virtual machine to give it a fairer trial. The install option was
unambiguously labelled Instalar en tu disco, which was fairly obvious.

Installation followed the fairly standard GNU/Linux pattern - select the
keyboard, work out which software modules are required and download any not present.

And that’s where the wheels came off. I was intending this to be a full review
with screenshots but during the installation process Huayra decided it needed to
download some wireless drivers, which given it was doing it from a topologically
remote repository stopped the installation dead - not Huayra’s fault of course -
there was no reason for the designers to expect that anyone from outside of
Argentina would try installing it.

It did stymie my giving it a fair trial. However, even the live CD
version showed its potential as a way of providing a comprehensive software
environment with little or no licensing overhead - something ideal for giving
students access to the internet, especially where they do not necessarily have
access to the latest hardware, or the budget to pay for a comprehensive
migration to a more recent version of Windows.

Thursday 6 February 2014

No, Google wasn't down yesterday ...

Yesterday afternoon, Google services died on me. Gmail went away, docs went offline. Everything was out, even the apps status page that tells you if there’s an outage.

I immediately tweeted to see if anyone was seeing this, and got back responses which showed it was out for everyone on campus, though no one off campus seemed to be seeing problems.

It was, to quote a colleague, as if the aliens had landed.

Now Google was not out, and was not feeling indisposed. What had happened was more interesting.

A router between here and Googleland had decided it was over forwarding packets to Google and had decided to send them someplace else. For some reason monitoring did not pick up on the error and fail over to a secondary route (or router).

Why this happened is still under investigation.
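
The failover that should have happened can be sketched in a few lines of Python. This is only an illustration and not the campus network's actual monitoring: the route names and the probe results are hypothetical. The point is that the health check has to test whether traffic really arrives at the far end, not merely whether the router is up.

```python
from typing import Callable, Iterable, Optional


def choose_route(routes: Iterable[str], is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first route whose end-to-end probe passes, or None if all fail."""
    for route in routes:
        if is_healthy(route):
            return route
    return None


# Hypothetical probe results: the primary router is up but is not delivering
# packets to their destination, so an end-to-end probe reports it as unhealthy.
delivered = {"primary": False, "secondary": True}

active = choose_route(["primary", "secondary"], lambda route: delivered[route])
print(active)
```

A probe that only checks that the router answers would have reported the primary as healthy, which is roughly the situation described above.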

However this does neatly highlight a problem with outsourcing everything to the cloud. Not only do the cloud providers have to do their thing, the infrastructure also has to - it’s no good if your cloud provider never goes down if your infrastructure does.

Which on a sparsely populated continent on the bottom of the world increasingly means infrastructure and links to overseas data centres.

Basically, even if we can minimize latency, adjacency still counts. Or, to put it more simply, using more adjacent services, i.e. those on the same continent, reduces the admittedly small risk of the internet going away, if only for a few minutes …

Written with StackEdit.

Wednesday 5 February 2014

And what do we do with the media after we recover the data?

Last night I had a disturbing dream.
I dreamt that the cleaners had thrown out the boxes of tape media stored in front of my desk. Of course they had done no such thing.

But it raises a question. What do we do with legacy media once we have recovered content?
There is a lot of born digital archival data out there stored on legacy media such as DAT tapes, zip drives, five and a quarter inch floppies and the like.
And in the main we can't read them - well not easily. We can either buy and maintain the equipment ourselves, or else employ a specialist recovery company to read the media for us. Typically we send off boxes of tapes and get back a single terabyte drive containing the recovered content. And on the whole we are quite good at creating procedures to ensure that the media is handled correctly, as after all, it contains the only record of the data and has cost time and money to produce.
But after we have read and ingested the contents of the terabyte drive, verified that the data is readable, created metadata and the like, what do we then do?
Actually, of course there are two problems: disposing of the recovered media and disposing of the original media.

Dealing with recovered media

When we get recovered data back from the data recovery company we use, it comes on either a USB stick or an external hard drive. The data recovery company keeps a copy of the recovered data for a short period of time just in case the courier's van suffers a mishap. Once we've confirmed we've received and been able to read the recovered data there's no reason for them to keep the data, although I'm sure they'd be happy to keep an archival copy for us using a service like Glacier if we asked nicely and gave them our credit card number.
Once we've ingested the data into our storage system the drive containing the data becomes pretty redundant. However, as they don't take a lot of physical storage space we might as well keep them just in case. For a year or two at least.
After that we have a choice:

Offer them to the data owner. They might have their own reasons for keeping them
Dispose of the media using a secure data disposal service

In both cases it's important to document what we did, and get some sort of formal receipt from either the disposal company or the custodian. (I was once involved in a problem where, as part of resolving a contractual dispute, a whole pile of emails had to be retrieved from a tape archive, and it turned out that someone had had a little finger trouble and had prematurely wiped some of the archive - records and incident reports were key to showing it had been a genuine stuffup and not malicious.)

Dealing with the original media

This is a little more contentious. It's basically useless, and given that you've just spent a few thousand dollars having it read by specialist contractors it's unlikely you are going to repeat the experience.
This is of course assuming that the data has been recovered successfully. If the media was corrupt or otherwise not able to be read successfully you might well want to keep the originals in case someone has a bright idea about how to recover the data.
Legacy media is also bulky to store. A cubic metre's worth of DAT tapes will fit on a single external hard drive. The data owner may not want it back for this reason.
However what we do with it is basically just the same as we do with the recovered data:

Offer them to the data owner. They might have their own reasons for keeping them
Dispose of the media using a secure data disposal service

But of course in both cases document what we have done. That way we have a provenance for the archived copy of the data and a history of what was done and how.

Markdown for archival description

And why bother with Markdown?

There are a lot of reasons, most of which centre around efficiency and simplicity,
but there’s also another one as regards digital archiving.

Often we need to generate descriptive material around items, in short to
document them. One simple example would be to say that a particular archived
dataset contains data recovered from a set of 9 track tapes and this is how the
tapes were read.

We might also wish to include a table listing the tapes and information
contained on the tape labels and some photographs of the tapes themselves.

And what we are doing is creating both a statement of provenance and a link to
the physical manifestation of the data.
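
As a rough illustration of what such a provenance note could look like, here is a minimal Python sketch that writes one out as Markdown. The tape identifiers, label text, photograph paths and the wording of the method statement are all made up for the example, and the pipe-style table is a common Markdown extension (as supported by GitHub and pandoc) rather than part of the original Markdown syntax.

```python
def escape_cell(text):
    """Escape pipe characters so a tape label cannot break the table layout."""
    return str(text).replace("|", "\\|")


def provenance_markdown(title, method, tapes):
    """Build a Markdown provenance note: heading, method statement, tape table."""
    lines = [
        f"# {title}",
        "",
        method,
        "",
        "| Tape | Label text | Photograph |",
        "| --- | --- | --- |",
    ]
    for tape in tapes:
        photo = f"![{tape['id']}]({tape['photo']})"
        lines.append(
            f"| {escape_cell(tape['id'])} | {escape_cell(tape['label'])} | {photo} |"
        )
    return "\n".join(lines) + "\n"


# Hypothetical example data, for illustration only.
tapes = [
    {"id": "T01", "label": "SURVEY DATA VOL 1", "photo": "photos/t01.jpg"},
    {"id": "T02", "label": "SURVEY DATA VOL 2", "photo": "photos/t02.jpg"},
]
note = provenance_markdown(
    "Provenance of the survey dataset",
    "Data recovered from a set of 9 track tapes read by a specialist contractor.",
    tapes,
)
print(note)
```

The result is a plain text file that stays readable even if nothing is ever used to render it.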

Over the last ten or so years much toner has been spent on creating policies
around preferred archival file formats for documents, but when we create these
provenance documents we tend to reach for Word and other such standard tools,
ignoring our own advice about format longevity.

Markdown, because it’s text based, is ideal. We can transform it to make a PDF, or
a Word or Libre Office document. Yet because it’s text based it can be read with
almost any display tool, and the format is sufficiently simple to be interpreted
without any real knowledge of the syntax of Markdown itself. In other words, we
can treat Markdown as simply an enhanced text file.

And of course, in using it as a documentation standard we’re not setting a
precedent: GitHub was there before us …

Tuesday 4 February 2014

MarkdowntoPDF

In the course of playing with distraction free editors I happened across MarkdowntoPDF, a web based service to convert Markdown documents into PDF.

I also couldn’t help noticing that the documents were nicer looking than the ones I generate with a combination of Pandoc and Libre Office, and something about the look of the documents and the fonts used made me suspect that it wasn’t using a conversion via LaTeX or one of its variants.

I don’t know what it is about TeX documents but there’s a certain style, almost a smell, about them that’s strangely distinctive.

So, always curious I fed a pdf document created with MarkdowntoPDF through Apache Tika:

Content-Length: 30235
Content-Type: application/pdf
Creation-Date: 2014-02-04T03:28:46Z
Last-Modified: 2014-02-04T03:28:46Z
Last-Save-Date: 2014-02-04T03:28:46Z
created: Tue Feb 04 14:28:46 EST 2014
date: 2014-02-04T03:28:46Z
dcterms:created: 2014-02-04T03:28:46Z
dcterms:modified: 2014-02-04T03:28:46Z
meta:creation-date: 2014-02-04T03:28:46Z
meta:save-date: 2014-02-04T03:28:46Z
modified: 2014-02-04T03:28:46Z
producer: mPDF 5.6
resourceName: add78ef0
xmpTPg:NPages: 1

And there it was - the document producer was mPDF, a tool written in PHP to create PDF output from HTML. (Full information is available via Packagist).
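
If you want to pull the producer out of that sort of output automatically, a minimal Python sketch is below. It assumes the metadata has already been saved as plain "key: value" text like the listing above. It does not call Tika itself, and the function name is my own invention.

```python
def parse_metadata(text):
    """Parse 'key: value' lines into a dict, splitting on the first ': ' only.

    Splitting on colon-plus-space keeps namespaced keys such as
    'dcterms:created' intact.
    """
    metadata = {}
    for line in text.splitlines():
        key, separator, value = line.partition(": ")
        if separator:
            metadata[key.strip()] = value.strip()
    return metadata


# A few lines copied from the output above.
sample = """Content-Type: application/pdf
created: Tue Feb 04 14:28:46 EST 2014
dcterms:created: 2014-02-04T03:28:46Z
producer: mPDF 5.6
xmpTPg:NPages: 1"""

meta = parse_metadata(sample)
print(meta["producer"])  # prints: mPDF 5.6
```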

Which also means that the service uses HTML as an intermediate file format rather than TeX or ODT, and also opens up the idea of the multi platform document - write it once, generate a web version for display and provide an easy way to automatically generate a PDF document for printing, emailing out, or archiving in Evernote …


Mavericks

Back in October 2013 Apple released a new, free, upgrade to OS X. Officially OS X 10.9, it was more commonly referred to by its code name Mavericks.

Well, I’ve been in this game long enough to know that you don’t go and upgrade operating systems on day one, especially if you use a lot of third party products.

In my case I’d say that the main products in question were

Microsoft Office
Libre Office
Text Wrangler
Evernote
Chrome
Wunderlist
Virtualbox
Dropbox

So I did nothing. Five or six months later OS X 10.9 had had its first global update and an application - Wunderlist - started complaining that I needed to move to a later OS to run the latest version.

So I reckoned it was time to migrate - my work machine at least. I thought I’d leave the home machine, which is getting on for five years old now for the moment in case of performance problems. (While it’s on 10.6.8, just like my work machine, it has a slower cpu and only 2GB of memory. I reckoned if my work machine was slow or bad after the upgrade, chances are that the old iMac would be doubly so).

So one morning when I had things to do that I could equally do on my Linux laptop I opened up the App store and clicked on the option to upgrade to Mavericks.

The whole download and install process took a couple of hours but was very smooth - a couple of passwords and we were away. At the end of the process it also decided I needed to upgrade the EFI BIOS on the laptop and that took an extra ten minutes, but basically the upgrade just worked.

All my key applications stayed working - no tedious extra application updates required.

In use the machine seems a little slower than previously - Chrome and Evernote - both of which used to pause occasionally - seem to pause a little more but general responsiveness is about the same.

Everything else seems to work. I have this bad habit of leaving applications open and never closing anything, and I have this feeling that I can’t load things up with quite the abandon I used to, which along with the Chrome and Evernote pauses might be a suggestion that I would benefit from a little more memory, but the machine is by no means slow in use.

The only thing I found really irritating about the upgrade was that somewhere in the process they changed the direction of scroll on my external mouse, perhaps to make it a little more iPad like - the solution was simple: into settings and untick natural for the scroll settings.

Otherwise it’s fine. I’m still debating over upgrading the old home machine though given the possible memory requirements …


Monday 3 February 2014

Distraction free markdown editors

Ever since I got interested in using Markdown as a lightweight documentation format I’ve been seeing references to distraction free markdown editors.

Always curious, I thought I’d download one and see what was so special about a distraction free editor.
More or less at random I chose texts.io.

Now I am very old school. I’m happy with kate on linux, textwrangler on a mac, textedit on android, and just about remember a lot of the weird keystrokes to make vi do useful things. I used to use PFE on windows until the lack of updates made it a pain. Since then I’ve never found an editor in windows that I feel totally comfortable with. But that’s more a comment about me rather than anything else.

I’ve also written raw TeX enough times to know that having an editor that does the grunt work of embedding the command sequences for you can be a great help.

I’ll even admit that one of the reasons that I use stackedit to write these blog posts in markdown is to use some of the widgets to do the formatting for me.

So just like the early nineties html editors, a good markdown editor should take the grunt work out of formatting things. This is different from the language awareness of kate or textwrangler which use highlighting to help you trap errors. Basically, hit the Code button and you should be able to mark a block of text as a code fragment.

At its simplest this is what texts.io does: it provides you with a blank screen and some basic formatting tools. This is the distraction free bit, the idea that a fullscreen minimalist user interface helps you focus on what you’re writing - which may be so. If you’re interested in trying out the distraction free experience without installing texts.io, Writebox for the Chrome browser (and by extension a Chromebook), and also for the iPad, allows document format agnostic editing.

So, the distraction free bit is all about maintaining focus and, by using markdown, not getting too distracted by twiddling the layout of the document.

When I tried it, texts.io provided a pretty good editing experience - not good enough to make me want to change but pretty good.

Markdown as a document format is however pretty useless. Not a lot of applications out there can render markdown - which usually means either running the file through pandoc or else using a service like MarkdowntoPDF to make a pdf version of the document.

Personally I usually export my markdown documents to odt and then insert them into a blank template document to both spell check them and make them look professional before emailing them into evernote or onto colleagues.

A nice feature of texts.io is that it provides an easy scripted install of pandoc (and XeTeX for PDF creation) for document export. And I guess that this is really the selling point of texts.io and similar products. texts.io is a paid for product: for a small fee you get a nice editor, and one that takes the pain out of installing and using pandoc and XeTeX for document export - i.e. it appifies the process - which might be anathema to old school toolchain people like me but, on the other hand, why should users have to fiddle about with installing multiple applications and getting them to work together …
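
For the old school toolchain route, the pandoc step can be wrapped in a few lines of Python. This is a sketch under assumptions, not what texts.io does internally: the file names are hypothetical, and the `--pdf-engine` option is the spelling used by pandoc 2.x and later (older releases used `--latex-engine` instead), so check it against the version you have installed.

```python
import shutil
import subprocess


def pandoc_command(source, target, pdf_engine=None):
    """Build the pandoc argument list; pandoc infers the output format from the extension."""
    command = ["pandoc", str(source), "-o", str(target)]
    if pdf_engine:
        # Assumes pandoc 2.x or later; earlier versions spelled this --latex-engine.
        command.append(f"--pdf-engine={pdf_engine}")
    return command


def convert(source, target, pdf_engine=None):
    """Run pandoc, failing with a clear message if it is not installed."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed or not on the PATH")
    subprocess.run(pandoc_command(source, target, pdf_engine), check=True)


# Show the commands that would be run for hypothetical files.
print(" ".join(pandoc_command("post.md", "post.odt")))
print(" ".join(pandoc_command("post.md", "post.pdf", pdf_engine="xelatex")))
```

Calling `convert("post.md", "post.odt")` would then produce the ODT file, provided pandoc is installed and `post.md` exists.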


Chromebook woes

Despite being initially pleased with Acer’s turnround on the repair of my Chromebook, it appears I spoke too soon.

Last Thursday it started crashing again - this time getting itself into a state where it would neither boot nor shut down after a crash.

Of course, I couldn’t get through on the helpline, so I ended up logging a call via the web which is still sitting in the unresolved queue.

However, on the plus side I’m not dependent on the Chromebook, and I like the machine and the Chromebook concept - I’ve just ended up with a bad one, and I’ve every confidence I’ll get it repaired or replaced under warranty …


Routers and ISPs

I’ve previously written about my fun with a 3G router and our rather sketchy home adsl service.

Having got the 3G router working and providing a reasonable alternative, we then decided to change ISPs to one offering a cheaper phone and internet package.

Perhaps because they have a newer DSLAM, or supplied a new router better able to cope with the 1980s legacy copper cabling, or even that a lot of people have been away on holidays, their new service has proved stable - perhaps not quite as fast as the previous service, but I’ll trade speed for stability.

Since changing over we’ve used the internet just as much as previously, and we’ve only once lost sync on the copper ADSL and failed over to the 3G network - if this degree of resilience keeps up we can probably just carry on with a pay as you go 3G USB modem for network backup.

However, there’s still the caveat that today is the first day that kids are back at school and holidays are most definitely over - it will be interesting to see what happens with the early evening slowdown and what sort of network resilience we get.
