Monday, 25 June 2007

Old trains and digital archiving ...

I like railways. Or more accurately I like the social history of railways and the changes they brought, and they were, in their time, as much a world changing technology as the internet. So while I admit to taking pictures of trains when I was twelve, I was always more interested in the station buildings, the posters, the advertising and the changes in people's lives.

The railway made tourism possible, at least for the middle classes. It made travel possible. One of the more bizarre moments is that John McDouall Stuart, the man who endured terrible privations surveying the route of the transcontinental telegraph line from Adelaide to the north coast of Australia in the 1860s, announced his return from the unexplored outback by sending a telegram from the railhead at Burra and getting the morning train back to Adelaide.

So railways were a world changing phenomenon. And their relics are all around, but rapidly disappearing as the world increasingly forgets railways. Equally their social history is also understudied, perhaps because of the unfortunate association of an interest in railways with the sad men who stand at the ends of station platforms in England with notebooks and flasks of tea, collecting engine numbers.

Now I must admit that during the time I travelled extensively by rail for business in England, I didn't pay much attention to this interest of mine. Too close to work, too many other things to do. Since moving to Australia it's become a greater interest, if only because one looks at abandoned train stations and realises whoever designed them copied designs already in use in England. In fact following up on this is the sort of project I could imagine myself spending some of my declining years engaged in. After all, it encompasses my interests in history, archaeology, bushwalking and in playing with computers and digital cameras. Not to mention a professional interest in digital preservation and archiving.

And, I thought, there must be a wealth of material on the web, enough sad buggers who have assembled collections of source material and photographs.

There isn't. As a totally unscientific test I tried looking for pictures of Callander station on the web. Callander was a jumping off point for the Trossachs, a favourite Victorian tourist destination, and had a big white wooden station. I found exactly one picture.

This was puzzling to me at first. It was a popular destination, people must have taken pictures of it. I remember taking pictures of the derelict station sometime in the late sixties/early seventies when I was all arty and into photography the way teenage boys sometimes are. Of course I don't have these photographs now, or if I do they're unclassified and as good as lost, rather in the same way that Roman coins found by metal detectorists and stripped of their context have little historical value.

And then I realised why. The train line at Callander closed in 1965, meaning pictures of the working station must be forty years old. The station stood derelict for some time thereafter, and people other than me must have taken pictures of it, but they're probably in people's sheds and attics gently decaying, and the person who took them dead, or at least pretty old.

Now the site is a car park and there's no opportunity to reconstruct the original building.

And because no-one documented these things some of our history is being lost.

However not all is doom and gloom. In the course of checking this out I came across the website of the Great North of Scotland Railway Association, who are actively trying to archive (and by implication, catalogue) their members' holdings as a resource for future study.

Equally, at the other end of the world, the State Library of Tasmania has an eHeritage initiative, working with local historical societies to digitally preserve historical records, documents and photographs to ensure that they don't get lost.

And that's the key. Digitisation without a preservation strategy is valueless. Preservation without archiving, i.e. adding context to the items preserved, is valueless. Properly digitised and preserved they're a resource for future study.

They may seem mundane, but to a first century Roman clay lamps seemed mundane. Now their distribution tells us a lot about Roman trade routes. Similarly, preserving today's and yesterday's commonplace gives us a picture of how life was lived ...
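
As a rough illustration of what "adding context" might look like in practice, here is a minimal Python sketch that writes a small catalogue record next to a scanned photograph. The field names, the JSON sidecar layout and the function names are my own assumptions for the example, not any library's or archive's actual schema; a real project would more likely adopt an established metadata standard.

```python
import hashlib
import json
from pathlib import Path


def make_record(image_path, title, date_taken, place, photographer, notes=""):
    """Build a catalogue record for one scanned item (hypothetical fields)."""
    data = Path(image_path).read_bytes()
    return {
        "file": Path(image_path).name,
        # The checksum lets a later reader confirm the scan is unchanged.
        "sha256": hashlib.sha256(data).hexdigest(),
        "title": title,
        "date_taken": date_taken,
        "place": place,
        "photographer": photographer,
        "notes": notes,
    }


def write_sidecar(image_path, record):
    """Save the record as a JSON file alongside the image."""
    sidecar = Path(image_path).with_suffix(".json")
    sidecar.write_text(json.dumps(record, indent=2, sort_keys=True), encoding="utf-8")
    return sidecar
```

Something like make_record("scan0042.tif", "Derelict station building", "c. 1970", "Callander", "unknown") followed by write_sidecar would leave scan0042.json beside the scan (the file name and values here are invented). The point is only that the who, what, where and when travel with the image rather than living in someone's head.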

Friday, 22 June 2007

DocX - the nightmare continues ...

Well whatever we feel about docx it isn't going away, especially now that Microsoft have End_of_Life'd Office 2003 in a move to boost the uptake of Office 2007, which means we need to be pragmatic and come up with a workable solution, which in the case of the Mac seems to be NeoOffice. Microsoft's own import filter for the Mac just barfs on my machine but NeoOffice imports neatly, graphics and all, and lets you export the document in various useful formats.

Making 2003 EoL is of course also going to be a nightmare for multi-platform sites, as docX will start to spread through their Windows fleet in a near viral manner, causing mayhem to the non-Windows installed base. Sites with large numbers of Windows machines will experience a similar problem because of the financial hit that upgrading everyone at once would cause. Open Office as a corporate office suite? Reads all your legacy documents just fine. Only problem is that Open Office isn't really integrated into Aqua on the Mac, even though they're working on this.

So for now NeoOffice is your friend if you have Macs on site. Does what Open Office does and handles docX to boot. One quirk though is that Safari recognises that docx, like odf and like the good old Open Office format, is a zip based format and helpfully unpacks the docx bundle for you. To work round this little problem I resorted to downloading the offending file using Parallels and dragging it from the Windows desktop to the Mac desktop. Surely there's got to be something a tad more elegant ...
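
One possibly more elegant workaround, sketched here in Python using only the standard zipfile module: since the bundle is just a zip archive, the folder Safari leaves behind can in principle be packed back into a single file. The function names are mine, and I'm assuming the word processor doesn't care about the order of entries in the archive, so treat this as an untested-against-Word sketch rather than a guaranteed fix.

```python
import zipfile
from pathlib import Path


def rezip_bundle(src_dir, out_path):
    """Pack an unpacked .docx folder back into one zip based file.

    out_path should live outside src_dir, otherwise the archive
    would try to include itself.
    """
    src = Path(src_dir)
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as bundle:
        for path in sorted(src.rglob("*")):
            # Skip folders and the Finder's housekeeping files.
            if path.is_file() and path.name != ".DS_Store":
                # Member names are relative to the bundle root and use "/".
                bundle.write(path, path.relative_to(src).as_posix())
    return out_path


def list_members(docx_path):
    """Return the names of the parts inside a zip based document."""
    with zipfile.ZipFile(docx_path) as bundle:
        return bundle.namelist()
```

Calling list_members on a document that arrived intact is also a quick way to see for yourself that it is a zip of XML parts and not an opaque binary blob.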

But this isn't a complete solution to the problem of submission to scientific journals I blogged about earlier. At least it lets you edit .docx documents. Still, it doesn't really handle the problem that the equation editor in Office 2007 doesn't use MathML or a compatible format.

And this is important as typesetting equations is hard, and computerised typesetters can have their own quirks. One of the reasons AmiPro was so popular with mathematical scientists when it appeared was that it had an equation editor that produced TeX code and yet was a proper onscreen word processor. Just as the only reason TeX has hung on is that typesetters understand it and what you put in is what you get out.

Now writing a program to parse markup isn't that hard (OK it is but it's doable), which means you can convert TeX, MathML or any markup based document to something a typesetting machine understands (SGML or whatever - one of the weirdest sights I ever saw was a commercial printer that had a floor full of people in cubes editing raw SGML in vi on Macs to feed it into a typesetter and fix any conversion problems).

The other key point is that if the document is in a known, or well understood, format you can always convert it to something else. docX isn't: the specification is owned by Microsoft and they can tweak it to fix problems, which means that you get creep, which is a document conversion engineer's nightmare. ODF and the other open formats have specification documents which you can refer to. Adobe have published detailed specification documents for pdf to allow you to write your own pdf export utilities, meaning the format is open in the sense that the knowledge on how to parse it is publicly available.

All good. As the world's gone digital, lots of documents, research findings, whatever only exist in electronic form. Yes, people may have printed copies scattered round their offices, in the same way they used to have offprints, but there are no catalogued, findable, non-digital copies.

If the document is in a format that can't be read it might as well be dead, or written in Linear B, Maya, Tokharian, or something equally obscure. If the format's known we can always access the knowledge. And fundamentally that's why docX is a problem. It doesn't follow open, described standards, so there's no guarantee of future access, or that when we open a docX document created with Word 2007 we'll see exactly the same document when in 2012 we open it with Word 2011.

[Addendum: In this discussion I'm ignoring the very similar problems caused by Excel's new xlsx format in Office 2007, but that doesn't mean they're not out there]

Monday, 18 June 2007

keyboards and dishwashers ...

I eat lunch over my keyboard (one of my less appealing habits). Every so often I have to invert it and shake the crap out and I've already written off one keyboard with my unsavory habits.

So I've always been interested in ways of keeping keyboards clean especially ones in public access labs that get kind of yucky after a year's greasy fingered students have done their worst.

Most cleaning solutions include expensive products plus the employment of cleaning staff to come round and do the cleaning. So given that most of the waste in a keyboard is skin, grease and food waste I've always wondered if you could run a keyboard through a dishwasher and then dry it off with some water displacement chemical, e.g. WD40. Now someone at NPR's tried exactly that. And it does seem to work. Even if it probably invalidates the warranty and risks doing damage to the keyboard. But there seems to be all sorts of FUD about doing it. But then when a basic USB keyboard costs ten to fifteen bucks, what's the risk?

If your keyboard works afterwards, you've saved $10. If it doesn't you're no worse off than you were ...

Sunday, 10 June 2007

[canberra] Magpies after nest material already ...

I have no idea if it's global warming or not, but this morning, when I went out to pick up the paper from the drive, there were a bunch of magpies banging about in a eucalypt, looking for all the world as if they were after nesting material.

Given we've had two days of high winds they could be wanting to repair nests, but it's not yet mid winter and they're turning to nest building in Canberra.

Just to add to the conundrum the azaleas and even the jonquils (miniature daffodils) are flowering in our garden ...

Wednesday, 6 June 2007

Desmond Morris, a liking for red, and blondes ...

Now we could get all religious and start talking about Adam and Eve and red apples, or indeed do the Desmond Morris 'Naked Ape' thing about red lips and red labia, but it still remains that the evolution of human colouration is an interesting topic.

And blondeness is another aspect of human colouration that looks like an odd mutation. I first blogged about this over a year ago on my other blog: here's a slightly updated version:

Blondes ...
posted Tue, 28 Feb 2006 15:56:02 -0800

I've often wondered about the evolution of blonde hair. It really would have to have had an evolutionary advantage to become common, and anyway, why is it found only (near enough) in populations whose ancestors lived in northwest Europe?

Now blonde is a funny set of mutations, pale skin, blue eyes, yellow hair - definitely a bit of a freaky mutation.

Other populations who live in northwest Europe have the pale skin mutation - your classic Celtic beauty with milk white skin is probably the result of a selection for a population that makes the most of the available sunlight to make vitamin D. Very sensible in a population where it's cloudy and rains a lot, and consequently not a lot of sunlight.

I'm going to guess that you don't see such a mutation in northern Japan because they get more UV, even if it rains a lot.

But why blonde - a freaky looking mutation that makes them look like a different species?

Peter Frost, a Canadian anthropologist, has been wondering about the same thing and has come up with the theory that blondeness evolved to make striking looking people who were more sexually desirable to our neolithic ancestors, and so blondes, as well as having more fun, got more food and ended up being more successful at reproducing themselves.

Studies of northwest European populations show that there's a welter of variants of hair colour and the mutations of the three hair colour genes date back to the end of the ice age, say 11,000 years ago, when the population was small and the mutation could spread quickly among particular groups.

There's an updated article in the London Times on his research.

More generally Peter Frost seems to have been doing a lot of work in this area and has written a book on the evolution of lighter skin colour in humans, otherwise known as 'why aren't we all brownish coloured?'

How good the research is I don't know but it appears to have been accepted for publication in a respectable journal or two so it seems plausible. Even if it's wrong it's an interesting idea.

postscript - Aboriginal children in Australia often have blonde hair. Nineteenth century romantics used to say that this was from shipwrecked Dutch sailors in the sixteenth century, never mind that the Aborigines don't have any stories about this while they've lots of stories about other things. What evolutionary advantage would blonde hair give them?

Primate colour vision: was fruit or sex the evolutionary driver?

A long time ago, I used to do research in animal behaviour. And one of the questions that stayed with me was why did primates evolve colour vision. 3D is easy, if you climb about in trees a lot having 3D helps you avoid killing yourself, but colour? Yes you need to know when fruit are ripe but an augmented monochrome vision would do that as long as there was contrast with the surrounding vegetation.

And then there's birds. A long time ago I got interested in this topic, but really through looking at the evolution of berry colour and foraging strategy in birds.

Basically my question then was 'why are so many berries red', which given that a lot of birds perceive different colours to us is slightly weird. And of course 'why are those which are not red blue-black?' Which is kind of interesting as while red ones are easy for us ape derived beings to see, blue-black is markedly less so (and why do some go through a red phase of ripening?)

Certainly birds rapidly learn to associate all sorts of colours with either acceptable or distasteful food objects so there's some flexibility there, it's not as if the red preference is hard wired. I've done a little proof of concept experiment on this and Ian Soane did a much better one on 'Why are distasteful prey not cryptic?'.

Fruit bats are also kind of interesting. Like primates they have binocular colour vision and eat fruit and live in a tropical area with no real seasonality, meaning that brightly coloured (ripe) fruit are something they should be looking for if they want to live well.

So, my question is, which came first? Fruit colour to denote ripeness or colour vision to detect ripeness. And if the former why is our colour vision so good at detecting variations when really all that's needed is a detection system that says 'fit to eat (or not)'.

And some people at Ohio University may have the answer, or part of it at least. (There's also a fuller report here.)

A preference for red made us evolve a better response for red coloration. Which would also mean that red coloured fruits would stand out more and be selected. (Which is good if you're a fruit, as you get eaten and your seed shat out somewhere else with a nice pat of fertiliser to start some more trees).

But did the red preference come first as a response to the availability of red fruit because birds (and fruit bats) had a preference for them (our augmented redness detector does the enhanced monochrome thing)? And why the hell do we have difficulty with blue-black? Was red just simply good enough?

Monday, 4 June 2007

What do people really really want?

If you run a SoE (standard operating environment) you spend a certain amount of time debating what people need in the way of a base application install.

Usually it comes down to:

  • a text processing application (that interworks with Word [.doc])
  • a spreadsheet that can handle Excel files
  • a presentation viewer that can display PowerPoints
  • a web browser that supports javascript
  • various viewers and plugins including something to deal with pdf files
  • a mail client (with or without calendaring functions - people swing both ways on that)

and that's about it. You can build a Mac based environment. You can build a Linux based environment. You could build a thin client environment that does it. You could do it through X-Windows. Basically there's more than one way to skin this cat.

But from bitter experience I can tell you that you end up using Microsoft products.


Not because they're cheaper, or necessarily better, though some are pretty slick, but because Microsoft have convinced the 97% of the populace that don't do this for a job that they put the Word in Wordprocessing and that you can't possibly use anything else.

Now along comes a geek with an old 1980s Mac and a super duper up to the minute dual core AMD box who does a comparison between various applications on the Mac and his hot box. Now it's really a bit of fun and not that valid a comparison, but it does raise the question 'how much functionality do you really need?'

And with my retro/recycled computing hat on I've got to say I concur - not that much, and not that much need for expensive bloatware. The problem is however user education. Because Excel is effectively a synonym for spreadsheet, users are locked into the mindset of having to use Excel. And in a sense who can blame them - the shelves at Borders groan with 'How to do it with Excel' books but bugger all in the way of 'Open Office Calc for fun and profit'.

And that's it. People don't care what they have as long as they can get their job done. People know they can get the job done with Excel, and know there's a whole backup in terms of training, self help books, web forums and the rest. It's the comfort thing. And a support website run by geeks who know nothing and care less about multi currency cost accounting doesn't cut it - people want the reassurance factor, they don't want to be heroes, especially where their job's concerned ...

DocX is a nightmare ...

DocX is a nightmare. Work in a site where most people have Office 2003 or else Office 2004 for the Mac and we have a scenario where people can't handle documents from Office 2007 users, which is a bit bad when people collaborating on documents expect to share and revise them.

Now there are converters for the Mac and the PC, and people working with Office 2007 can save documents in the 'old' .doc format. A pain but workable you might think.

But then comes the news that Office 2007's .doc export mode isn't really .doc, well certainly as far as mathematical equations are concerned and as a consequence some journal publishers are refusing Office 2007 files.

Now given that people live and die by journal submissions and citations this is a fairly major problem. Suddenly TeX starts looking attractive, or perhaps some other common standard such as ODF. (Mind you I'll bet some publishers can't handle that either). This also has implications for the long term storage/archiving of documents in a revisable format - having your equations as a bitmap isn't the best ...