Monday, 10 December 2007

AAF Mini Grant

Along with my colleague Cathy Clegg, I've been awarded an Australian Access Federation mini grant to produce a shibbolized version of lyceum.

There'll doubtless be much more on this in the coming months.

Thursday, 6 December 2007

p2p, lockss and cheap archival storage

a thought ...

If what we want is cheap archival storage, could we do it very cheaply by using a lot of minimal boxes running p2p software to distribute the load? It should give the same functionality as LOCKSS, and with multiple boxes should be failure resistant, if not exactly resilient.

Put four cheap reasonably big disks in each box ...

How yahoo [does | will do] big file systems ...

Hadoop, in a word. Yahoo is very involved in the development of the Hadoop file system, and I guess it's going to be their googlefs equivalent for content hosting.

I'm also guessing that flickr will be in the [does | will do] box as well.

Which is kind of interesting. And leveraging off open source saves some of your development costs as well.

And as a bit of a non sequitur (ok, I want to keep track of the url), there's an interesting tale from the NYT about hadoop, Amazon's cloud computing environment and digital content management.

student filestores, unstructured data and isilon

One of the problems that universities have is that they have large amounts of unstructured data consisting of lots of small files that are subject to periods of rapid change and churn.

They're known as student filestores, and because of their nature they're a pain to administer, back up and restore files from. The backup/restore problem is due to the treewalk problem, whereby any such backup/restore program has to go and find the files and build and search a directory for them. This takes time, and on a chatty file system can mean that you never actually build a proper directory, as the content has changed by the time that you have finished treewalking the filestore.

Obviously there are tricks to get round this, such as splitting the filestores into multiple filestores, say based on cohorts, but it still exists. Most conventional filesystems have trouble with lots of changing small files and consequently most backup solutions do.

Now one way you could do this is to have a database driven solution that tracks when a particular file has been changed, and writes the path somewhere so you only backup new or changed files.

This is a solution we're developing for our student filestore which is based on apple xserve and xraid technology as while we can replicate the filestore and do diffs on the filelists to allow user driven restores, we can't actually back it up - something that we would like to do for DR purposes. So we're developing a database driven system to track changes and build synthetic backups that we can then write out to a volume and backup conventionally.
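As a rough illustration of the database driven idea, here is a minimal Python sketch using SQLite. It is not our actual system: the table layout and the function names (`open_tracker`, `record_change`, `pending_backup`, `mark_backed_up`) are all made up for the example, and it assumes something else (a hypothetical hook on the file server) calls `record_change` whenever a file is written.

```python
import sqlite3
import time


def open_tracker(db_path=":memory:"):
    """Create (or open) the change-tracking database (hypothetical schema)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changes ("
        " path TEXT PRIMARY KEY,"
        " changed_at REAL NOT NULL,"
        " backed_up_at REAL)"
    )
    return conn


def record_change(conn, path, when=None):
    """Note that a file was created or modified, keeping its last backup time."""
    when = time.time() if when is None else when
    conn.execute(
        "INSERT OR REPLACE INTO changes (path, changed_at, backed_up_at) "
        "VALUES (?, ?, (SELECT backed_up_at FROM changes WHERE path = ?))",
        (path, when, path),
    )
    conn.commit()


def pending_backup(conn):
    """Paths never backed up, or changed since their last backup."""
    rows = conn.execute(
        "SELECT path FROM changes "
        "WHERE backed_up_at IS NULL OR changed_at > backed_up_at "
        "ORDER BY path"
    )
    return [row[0] for row in rows]


def mark_backed_up(conn, paths, when=None):
    """Record that the given paths were written out to the backup volume."""
    when = time.time() if when is None else when
    conn.executemany(
        "UPDATE changes SET backed_up_at = ? WHERE path = ?",
        [(when, p) for p in paths],
    )
    conn.commit()
```

The point of the design is that the backup job asks the database for `pending_backup()` rather than walking the tree, so the cost depends on how much has changed, not on how many files exist.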

The alternative would be to use a true metadata and pointer driven filesystem like the google file system, where, while you have to rewrite chunks whenever a file inside a chunk changes, all you need to do is to back up the chunks, which are larger and easier for conventional filesystem type backups to handle. And you don't need to use the google file system - the fossil/venti combination found in plan 9 would work just as well.

This problem also exists in digital preservation. Digital content for long term storage typically consists of lots of unstructured data and content, with a lot of small files. However as we're doing this for long term storage they don't change, they just get added to.

Most solutions in this space are fairly conservative and rely on conventional file systems for an object store and a database to store the metadata associated with the objects. To this one adds something like SAMFS to store multiple copies of the object store and the individual checksums of the objects, and do some integrity checking to avoid bitrot. Hitachi's Content Archive Platform works like this as does Honeycomb (aka Storagetek 5800) and commercial digital preservation products like digitool from Ex Libris.

And this works fine, because the ingest rate is typically low and there's no churn, which means that whatever storage backend/tape archive/replication solution can cope by doing continuous synthetic backups (or by rsync, rdiff or whatever).
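The integrity checking mentioned above boils down to recording a checksum for every object at ingest and re-checking it later. A minimal Python sketch of that idea follows. It is not how SAMFS or any of the named products do it; the function names and the choice of SHA-256 are my assumptions for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in chunks so large objects need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(store):
    """Map each file's path (relative to the store) to its checksum."""
    store = Path(store)
    return {
        str(p.relative_to(store)): sha256_of(p)
        for p in sorted(store.rglob("*"))
        if p.is_file()
    }


def find_damaged(store, manifest):
    """Return the objects that are missing or no longer match the manifest."""
    store = Path(store)
    damaged = []
    for name, expected in sorted(manifest.items()):
        target = store / name
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(name)
    return damaged
```

Anything `find_damaged` reports would then be restored from one of the other copies of the object store.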

What happens if you're facebook or the Kodak Easyshare site? Users, and there are a lot of them, are continually adding and modifying content, and the content consists of lots of small files. And you've got to keep the content for as long as the user keeps on subscribing, and you have lots of files. Yes, you could quota each user to say 1GB or 1000 files (for file based backup and replication you're more worried about the number of files, or directory table entries, than the sheer amount of filestore) but if you've many thousands of users, it's still a lot of files. Too many to back up conventionally, but which you would probably replicate multiple times.

So you could say that the flickr filestore or the Kodak Easyshare filestore would be a close model for a typical student filestore on drugs.

Now I don't know how flickr provide their store but Kodak uses Isilon to provide their store.

So when Isilon came to Canberra to spruik their solution I was interested. Especially as we also potentially have a long term archiving problem with medical images, astronomical images, and astronomical data, as well as having to provide a large live student filestore. Something that would scale to 1600TB is interesting.

And it was interesting. Basically start with three nodes. Stripe the data across them in such a way as to have multiple redundancy across disks within boxes and between boxes to ensure that you could lose either a random set of disks or a box and keep going. Use infiniband as a backplane to glue the boxes together. Additional nodes can be added to increase the amount of filestore available provided you stay within the redundancy rules. And you don't back it up - you replicate it. (As an aside I suddenly realised at this moment why Hitachi had put MAID (like Copan) into their preservation package - most times you never need to access the replicated copies so why keep the disks spinning - simple when you think it through).
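To make the "lose a box and keep going" idea concrete, here is a toy Python sketch of striping across three nodes with simple XOR parity. I stress this is an assumption for illustration only: I don't know what redundancy scheme Isilon actually uses, and the function names are invented.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data):
    """Split data into two stripes plus a parity stripe, one per node."""
    half = (len(data) + 1) // 2
    first = data[:half]
    second = data[half:].ljust(half, b"\0")  # pad so both stripes match in length
    return [first, second, xor_bytes(first, second)]


def rebuild(nodes, original_length):
    """Recover the data from three nodes where at most one is None (lost)."""
    if sum(node is None for node in nodes) > 1:
        raise ValueError("single parity can only survive the loss of one node")
    first, second, parity = nodes
    if first is None:
        first = xor_bytes(second, parity)
    elif second is None:
        second = xor_bytes(first, parity)
    return (first + second)[:original_length]
```

With single parity any one of the three nodes can vanish and the data can still be rebuilt; surviving the loss of more than one node needs more redundancy than this sketch provides.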

And you present it as a contiguous filesystem, presented as shares and accessible by NFS, CIFS, HTTP and (grudgingly) the Apple Filing Protocol (AFP). Cost is around $10K/TB plus some extra for some of the replication tools. Not astoundingly cheap, but not ridiculous either.

But I had a niggle. It was all sales and nothing about how the filesystem worked. How do they manage it? Given its potential size it can't be a conventional inode or cluster based system, so I'm guessing it must be a distributed system something like fossil. They said they'd get back to me but they haven't. Certainly fossil would give them efficiencies.

And then there's the other problem - googling for technical information I came across a whole set of entries suggesting that there might be some financial problems in the parent.

On the other side, they did drop a hint that another university in Australia (Victoria actually) was possibly about to buy their solution for medical imagery. They promised to confirm that as well - something else I'm still waiting to hear about.

So, promising, not cheap and a few doubts, but if it works the way I'm guessing, it could be a really useful technology for holding large amounts of unstructured data either as filestore or for archiving.

Seattle/California trip powerpoint

I've been asked to do a high speed overview of my trip to Educause in Seattle and the Caudit study tour. It's a personal take, but if you're interested you can download the presentation. It's a little under 1MB in size.

Wednesday, 21 November 2007

chimps use tools to find tubers underground

From this morning's Australian:

This time, chimps living in the dry woodland savannas of western Tanzania have been caught digging up roots, tubers and bulbs with sticks and roughly shaped bits of bark.

If chimps can forage underground for food, the same may have been true of ancestral humans, hominids, who had similar brain power and hand shape.


The full text of the article can be found here

Lots of implications for human evolution and the evolution of tool use.

For years I thought that savannah dwelling baboons were a better analogue to our early ancestors as they occupied a similar ecological niche. But now that chimps have been seen spearing bush babies and using sticks to dig for roots, maybe being efficient at digging for roots led our ancestors to start exploiting forest margins and then move into open savannah, which they could then exploit more efficiently than the ancestral baboons - sticks are probably more efficient than fingers at probing for roots.

This could be a neat explanation of why a group of forest dwelling apes moved into a savannah based lifestyle; after all, there has to be some pressure for the move to take place. It might also explain why our social behaviours can also seem more baboon like than chimp like at times.

Tuesday, 20 November 2007

Debian Etch vs SUSE linux - do we care ?

This isn't a deep ranging technical review, more a sort of user perception based review. And it's to do with ease of installation and window managers.

A few years ago I would have argued that kde was the way to go, for the very simple reason it carried the same obvious set of metaphors as XP - start button, menus forked off, right clicks and the rest. Basically you could get kde to look and behave like XP, and if you could do that your retraining costs were minimal, linux was linux, and it was only a window manager, not an OS shell, yadda, yadda, yadda.

Gnome wasn't a contender as it was too alien, didn't quite work the same way, no start button, etc.

I still believe I was right, but since then a couple of things have happened:

  1. Gnome has got subtly better - don't ask me how, it just has

  2. Vista isn't the same as XP so retraining costs aren't the same problem

  3. The growing acceptance of OS X and the costs to change

and gnome has become more pervasive. It's on Ubuntu, which is possibly the most widely used desktop linux distro today. And given that user experience is governed by the desktop, it probably doesn't matter too much to users what the underlying distribution is.

Systems architects, software support people probably care but users don't.

It could be Etch, it could be Ubuntu, it could be Suse Linux. Users don't care. Installing software under any of them is easy: they all use a repository model, and you need privileges to do it. Again most people don't care as long as they can get web, email and some office productivity going. Weirdos like me like to install kate or kwrite, and perhaps some programming/scripting capability, but that's a minority sport.

And as all these distros come with sensible default software configs they probably don't ever need to install anything else.

So the difference comes down to the installers. SUSE's is nicer. It's graphical and crucially it handles that ever so tricky disk layout question much more nicely than Debian (or Ubuntu). Separate partitions, nice easy ways of fiddling with them if you want and no being dropped into something nasty.

So SUSE is nicer as a user installation experience. Otherwise it could be Etch, could be Ubuntu. All are easy to use and be productive with, and the user experience is essentially identical.

And I'm learning to love Gnome ...

Monday, 19 November 2007

blondes mess with men's minds

following on the theme of why blondeness evolves, there's a report in today's Australian to the effect that men tend to unconsciously dumb down when talking to blondes - the inference being that the stereotype of the dumb blonde is alive and well ....

Tuesday, 13 November 2007

Windows live ...

I've been playing with Windows live, which given my interest in web 2.0 technologies kind of makes sense, but when you compare the environment with Google Apps or Zoho doesn't really.

Why so?

Well we're in the participation age where we care and share. Seriously it's important, and we're investing a fair amount of effort in sakai as a collaboration platform to work around the fact that Australia is on the dark side of the world, meaning that if we want to interact with people in northern hemisphere universities it's got to be asynchronous as they're asleep when we're awake, and vice versa. As I've said elsewhere it's the tyranny of time zones.

Collaboration sites have also turned out to be really useful for committees and projects - lodge the documents and other relevant information, eg meeting notes, on a closed site and a project can proceed really well, with everyone always having access to papers.

Anyway, while we can see the use of collaboration tools on both a wide and local area basis it does have implications for teaching and learning when students can share material easily and exchange documents really easily, not to mention publish things in blogs, and so on.

But they can do this any way, so rather than agonise let's embrace and pretend we're doing this to teach group working - certainly putting students in groups for assignments teaches management skills.

Anyway, enough of this. Google Apps and Zoho both provide a means to share documents as well as some document creating tools and some online storage. Makes sharing and publishing to the web easier, gives you access to your files anywhere you have a browser (as long as it's ie, or a mozilla variant like firefox, camino or iceweasel), and makes it easy to email them.

Now, because we have labs full of computers that run software this doesn't seem so great a deal. If you don't however it's really useful - means all you need to access and modify your documents is a browser, which means that even with a web browser on a sun ray (or an old knackered mac G3) you can edit and share even if you don't have any of the standard tools. So like with thin client stuff, we're abstracted from making assumptions about the hardware.

Windows live is different. Sure you get web based email but a lot of the functionality is based on running lightweight desktop apps, which immediately starts making assumptions about host capability and capacity, and one's suddenly lost that martini (anywhere, any platform) capability. Though if you've a recent pc with xp or vista you probably don't care.

If you've a mac, or gasp, are running linux, you probably do.

So why the interest?

Both Google and Microsoft offer bundles to educational institutions, essentially allowing them to outsource their student email, and also to provide an alumni email service supported by ads. They also both provide calendaring. Given the cost of providing these services, outsourcing them effectively for free is an interesting option, especially as you can brand them your way.

Of course it's not free - there are network traffic implications and you need to maintain some infrastructure but it does mean the problem of providing student email goes away and they can easily keep their account as alumni. (If you make it difficult to get an alumni account students don't bother, they just use gmail, hotmail or yahoo with instant sign on and usability).

Now webmail is webmail, but you might have got the impression that I think Google Apps is possibly a better offering than Windows Live. Well I do, but given we've already got a collaboration service in Sakai, everything outside of email becomes a nice to have rather than a must have - so Google is ahead on points, not winning the race.

Microsoft's offering looks nicer (more facebook like - yes I'm playing with facebook as well to get my head round it), probably appeals better to less technical users, but again that's not a showstopper.

But Microsoft does have a potential show stopper - Exchange Labs, which allows the integration of windows live email and calendaring with your local exchange installation. Given that a lot of universities have some sort of exchange deployment for staff (usually not students - too expensive in terms of infrastructure and licenses) this means that staff and students can share calendaring information easily, making managing tutorials, assignment dates and meeting rooms and so on out of exchange really easy, and potentially it just looks like one big exchange installation. That is interesting as suddenly a whole lot of calendar integration problems go away ...

Alternatives, such as Apple's new calendaring solution in leopard server and Bedework, are untried, even though as standards based they should be easier to deal with. Google Calendar of course only does push and not sync out of the box, although you can integrate other calendars and use tools like spanning sync to sync back.

So if you wanted to make a decision on outsourcing student email and using either google apps or windows live as a platform to do so your two questions are:

- do you need a shared calendaring solution, and have you thought through what you want it to do for you?
- do you want exchange integration and why?

Friday, 9 November 2007

trips, earthquakes and nda's

After Educause there is life, or at least something resembling it. In this case a whole lot of vendor visits, all of which are under nda so I can't really blog about them except to say that I've been to

- Microsoft at Redmond
- Apple at Cupertino
- Google at Mountain View
- VMware in Palo Alto

all of which was kind of cool. VMware's building was opposite the original Xerox PARC which was also pretty cool. Now as Microsoft are in Redmond, which is a suburb of Seattle and the rest of them are back in the valley (ie Silicon Valley) this involved a flight from Seattle to San Jose.

And on the flight I sat beside a guy from Microsoft who was not impressed by Vista and its performance. As it was a private conversation I won't name names or reveal details but let's just say it was pretty honest and pretty direct. Let's just say some engineers prefer XP over Vista.

Just after we landed in San Jose there was a magnitude 5.8 earthquake. Not that we felt it in the back of a banged up shuttle bus bouncing down the freeway - just another bump in the road. Nothing at all like listening to Grace Slick belting out 'When the earth moves again'. When we got to the hotel everyone was still standing about, half afraid it might be a precursor to the 'big one'.

In the event it wasn't, even though aftershocks continued the next day.

In the middle of this we went to Stanford to see what they were doing with educational technology, which was kind of provocative, not because it was massively hi tech but because of the very clear vision that they had that learning technology was there to build engagement, and that it was a set of enabling technologies, not an end in itself.

One interesting point was that while 92% of students own a laptop at Stanford, not a lot of them carry them round campus on a regular basis - battery life and weight tend to make your sexy macbook seem like a boat anchor by the end of the day.

The other interesting thing was that anyone at Stanford (staff and students) can hand out a guest account to anyone else to allow them to have web access - really as a way of getting round not having any infrastructure like eduroam in place.

And while we were at Stanford we took the opportunity to catch up with the CLOCKSS people - the ANU is about to sign up to become members of the CLOCKSS project and we felt it would be useful to touch base with them and ensure that our understandings were aligned.

Like all such projects they're amazingly small - six talented people doing wonderful things with long term digital preservation and a digital content escrow service - I'll blog about CLOCKSS and our role in it separately in another post.

Monday, 29 October 2007

Educause ...

Educause, for those of you who don't know, is the biggest university IT conference in the US, and for that reason draws people from all over the world, including me.

In fact it's probably the largest in the world and as a consequence provides an unrivalled opportunity to meet vendors and discuss things, see new cool and sexy things, etc.

Also allows people to present papers on things they've been doing, etc.

So if you work in IT in universities, sooner or later you end up at educause. In my case it took me more than 20 years, but that's because I worked in the UK for 17 of them, and the UK does its own thing (some of which I helped organise [shameless plug for past career]).

So Educause. Theme was all about collaboration, often targeted towards VLE's (Virtual Learning Environments - a much better term than LMS, Learning Management Systems), and by extension identity management, after all you need to know what roles, or in a student context what courses, someone is signed up to to manage access.

There also was a great sense in which US universities were grappling with the web 2.0 phenomenon, either in how to engage the attention deficit generation and stop them going off and using resources outside of the course material, or indeed how to build wikis, blogs and the rest into the learning experience, especially the desire to provide a social networking environment.

One got the distinct impression that a whole lot of universities were not enamored of the participation age, and there was a great difficulty in adapting to group sharing rather than individual efforts.

Along the way there were some interesting snippets:

Email outsourcing:

Microsoft and Google are locked in a battle to handle outsourced student email and calendaring. The interesting thing is that Google had done a deal with Umea in Sweden, and given that Sweden is kind of sensitive about personal data that was good.

Microsoft were handing out windows live test accounts. I'll blog on this separately once I've played with windows live, but it's an open secret that windows live will provide exchange integration allowing faculty to stay on locally managed exchange and yet integrate with the students on windows live.

yahoo and by extension zimbra were not making a pitch for anything although it's my guess they're trying to catch up.


Keep notes and references? Used endnote? Tried furl, digg and the rest? Try zotero. A little firefox plug in that lets you track references and citations, personal archives of websites and the rest.

What's more, the coming server edition will allow you to build server based portfolios of what you're working on for input to various reporting processes as well as sharing with colleagues in the field. Google for it. It's worth it.


the conference was kind of uneven. Some presentations were better than others. Most were fairly mediocre but the trade show was good.

Overall pretty interesting.

Monday, 24 September 2007

Web pages on the move

not only am I moving house, my web pages are on the move as well. My old host, which used to provide me with a shell account, is shutting down.

I've moved my pages to an altogether less classy address, and one supported by nasty intrusive advertising, but then not everything in this life is free :-( (and I had to do it in a hurry)

My pages will move again if I can find a better home for them

Tuesday, 18 September 2007

global crash ...

now those of you who haven't been paying attention may have thought that the global economy had righted itself.

Not true.

Take a look at the Northern Rock fiasco in the UK. Last night the ABC showed an equally dire warning about the impact of the US sub prime mortgage crisis.

Very grim. Makes the tech wreck look like a christmas party.

But interestingly they failed to mention this simple and slightly frightening piece of logic:

US mortgage holders in part re financed their houses to buy additional consumer goods

The demand for these items in part fueled the staggering economic growth in China (where else do you think large screen tv's and ipods come from?).

This economic growth led to the mining boom in Australia

This mining boom led to our (Australia's) recent stellar economic performance and let us ride out the drought and other economic problems. It also allowed us to buy these self same consumer goods, given we don't make anything in Australia any more.

So if the US doesn't buy consumer goods from China, the Chinese economic miracle stops suddenly. This stuffs Australia and leaves us on a very dry continent that can't currently feed itself due to the drought and no way of buying imports as no one then wants our mineral exports.

As they used to say in university exams - discuss ...

Friday, 7 September 2007

OOXML, ODF, PDF and long term preservation

There's been a lot of talk over the last few days over Microsoft's attempts to get OOXML - basically their new xml based document format set, which includes docx, the new and non backwards compatible document format for word - accepted as a standard.

Now there's a lot of argument about OOXML and ODF (the open document format used by Open Office among others) as a standard for documents and by implication their long term archiving and accessibility, because let's face it no one cares very much what format an ephemeral document is in as no one plans to access it three years down the track.

But since the world's gone digital we do care about the electronic documents as they might include important things like contracts, where what you want to be is assured that what you are seeing in 2011 is what you filed in 2007.

This is where thinking about pdf gets to be useful. PDF arose in the days of multiple word processing formats to provide a common, non revisable document format. Basically a pdf file is a munged postscript page description file such as a printer would interpret to put marks on paper. It's good and it works. And Adobe did something remarkable. Even though pdf remains a proprietary format Adobe published the format specifications, and subsequent revisions, and were happy for people to write tools to create and display pdf files. Which is why we have xpdf, preview, ghostscript/gsview, and a number of other third party tools. Adobe reckoned that this would make the format much more widely adopted and that they could make enough out of selling their pdf creation and manipulation tools. Hell, they even give away the viewer for free.

And revision after revision Adobe have continued to do that. And there is a long term archival version, PDF/A (standardised as ISO 19005), that is essentially a restricted PDF 1.4, serving as the lowest base version universally supported that all their software will always be able to read.

So we end up with a de facto open standard, and a file format we know that can be read even if Adobe was to disappear, and to be frank, we trust Adobe to do the right thing purely because they always have done so in the past. And as a further display of confidence building, Adobe has submitted PDF 1.7 as an ISO document standard, in other words cementing its status as a publicly available and well known format.

So ODF is in a similar position. It's well known, open, documented, and consequently you have a high degree of belief that it is possible to write a program to parse and manipulate ODF format files. Furthermore it has been adopted as a file format standard providing that degree of assurance as to what ODF compliant means and also for the long term accessibility of files in that standard.

Then we come to OOXML. OOXML is basically Microsoft's response to ODF. As part of their attempt to maintain their market share they are attempting to turn their document formats into a set of standards. Of course by doing this they have to open up their file format, and it also means that they can't tweak and fiddle with the format the way they have done between various versions of word, say.

The argument is really about how tight the standard is. Microsoft have a reputation for being assertive in preserving their market share and in the past have tweaked file formats to give their products an advantage. Microsoft's draft standard has been rejected because of alleged ambiguities in the draft, and the unspoken worry that Microsoft will carry on in their own monopolistic capitalistic ways by exploiting these ambiguities.

If the standard is tightly enough drafted to avoid these loopholes the problem goes away - if Microsoft tweak things, they, like any other software manufacturer in the same circumstances, would be in breach of the standard. It wouldn't matter that they had originally authored the standard.

Of course Microsoft could silence all the critics by adopting ODF but realistically that won't happen.

The bottom line is that people don't trust Microsoft on the basis of past performance and some of their more gross abuses of their monopoly position. That's sad, but that's how it is. It also doesn't mean that Microsoft is intrinsically evil, the computer industry is littered with examples of similar abuses of standards and formats, it's just that Microsoft have been around for longer and have been a bit more blatant in their behaviour.

You might notice that I've never mentioned XML in this piece. That's because it's irrelevant to the discussion. The concern is about how tightly drafted the standard is and how open it is, not how the file format is encoded. The XML-ness of the format is a red herring as is shown by the success of the postscript-like pdf format.

Tuesday, 4 September 2007

Scrybe ...

as is well known, I like playing with online calendaring and diary management solutions. Well, I'm probably late to the party, but I've just come across Scrybe which looks to be an absolute killer.

Only problem is that the beta's currently closed to new users so I can't do a contrast and compare with Zoho, GoogleCal and Remember the Milk.

However, it's worth checking out the website and the YouTube video to get a flavour of what it might be able to do for you ...

Plone as a software archive manager

One of my colleagues, who has been playing with Plone recently has come up with an idea that is absolutely breath taking in its simplicity:

Use plone, which is a web content manager, as a software archive. You can upload disk images as documents, add descriptions and structure the information. Given that what we wanted out of the software archive was a solution that, to users, looked a little like but was easy for the people managing it to add information about packages and software to, it seems ideal.

After all we're just managing content.

Wish I'd thought of it (green with envy)...

Thursday, 23 August 2007

Zoho does Gears ...

Zoho has added Google Gears support to Zoho writer, meaning that their web based wordprocessor can now work offline and away from the internet.

This is an interesting development, but one that buys a lot of (potential) problems:

  1. If you want to work offline on a document why not simply export it to the word processor of your choice and then re-import it. Granted, using Gears allows you to keep the same interface, but that doesn't quite cut it
  2. How do you do your version control? Re syncing when one person is editing the document is fine, but when it's a collaborative edit how do you resync - after all someone may have edited the document after you've modified it offline

These are major questions. Anyone who has ever worked on a document collaboratively knows that version control and syncing can be a nightmare.
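To show why the resync question bites, here is a minimal Python sketch of the simplest possible policy: every offline copy remembers which revision it started from, and the server refuses the resync if anyone else has saved in the meantime. This is purely illustrative. I have no idea how Zoho or Gears actually handle it, and the class and function names are invented.

```python
from dataclasses import dataclass


@dataclass
class ServerDocument:
    """The shared copy, with a counter bumped on every accepted save."""
    text: str
    revision: int = 0


@dataclass
class OfflineEdit:
    """An edit made away from the network, tagged with its starting revision."""
    base_revision: int
    text: str


def resync(server, edit):
    """Apply the edit only if nobody else changed the document meanwhile."""
    if edit.base_revision != server.revision:
        return False  # conflict: someone saved after this edit's starting point
    server.text = edit.text
    server.revision += 1
    return True
```

A refused resync is the easy part. Deciding what to do next (merge, fork the document, or ask the user) is where the nightmare starts.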

So why would they do it?

There doesn't seem at first sight to be a compelling business case. Too tricky, too difficult and breaks collaborative editing, which is their unique selling proposition.

Well even with mobile internet, the internet isn't always on everywhere. In fact in a lot of places it simply isn't on at all. But it still begs the question why offline when you can export and work on with a standard word processor. After all good as Zoho is it's not as fully featured as Star/Open Office or Microsoft Office.

But then there are already devices like the Easy Neuf, basically an internet access terminal with some local compute and software execution. Now take it one step further and imagine a lightweight (in computing terms) portable computing device, sort of a mobile Easy Neuf. Could be a phone with a keyboard, could be something like an OLPC, could be something we haven't seen yet, but the key thing is that they have some limited processing capability yet are supposed to have persistent internet connections.

And if it's portable, a persistent connection is exactly what it doesn't have. So suddenly you work on the plane, on the train, and then resync as soon as you have connectivity, without having to carry a fully featured computing device. And that is very interesting because the one thing we know about fully featured computing devices, aka laptops, is that the battery life is crap. A pda with a keyboard does much better ...

Tuesday, 21 August 2007

girls _really_ do like pink

this has turned up in a couple of places, eg new scientist and guardian websites.

A couple of researchers at Newcastle University in the UK tested what colours people like best. Interestingly most people liked blue best but there was a strong preference among women for pink and pink-like colours. This also seems to be culture independent as they also reran the test on a group of Chinese migrants who grew up at a time when pink girlie things were not widely available, if at all, in China, and they showed the same preference mix.

The hypothesis is

  • in a hunter gathering society women do most of the gathering. Liking pink may give an advantage when selecting the riper fruits and berries, and we do know that there is a strong preference for red and possible evolutionary reasons for it. (see my posts here and here)

  • the preference for blue could go back to when we were savannah dwelling apes and the bright blue sky meant good weather and blue water was possibly good clean water. Well it's an idea anyway, if not quite as convincing as the pink preference hypothesis

There's at least one thought experiment here - do traditional desert dwelling Australian Aborigines have the same preferences? Or only one of them?

It would be really interesting if for instance the female preference for pink was demonstrated but not the overall liking for blue?

Why? Well the interior of Australia is hot, dry and red, a hot unrelenting blue sky signifies drought and there isn't a lot of water around. Yet the ancestors of the peoples who became both the Han Chinese and the races of western Europe speaking Aryan languages (and Turkic speakers for that matter) all started out on the central Asian steppe where a blue warm sky meant warmth and good weather.

And I guess it might also explain the Chinese belief that red is a lucky colour, the colour of prosperity.

Certainly some interesting hypotheses to play with ...

Friday, 17 August 2007

GooglePack. Google Docs and changing the world

There's been a lot of (virtual) ink spilt on GooglePack including Star Office and what it means for Google Docs.

My take on it is fairly simple. They're complementary. It allows people to work on documents offline and then post to google docs, share them and all the rest of the collaboration age stuff. It also allows people to work on documents anywhere, even if they are using a shared computer in an internet cafe at an airport, and then finish them off working on their own computer, then repost, republish them.

And crucially, Google Pack represents an easy, simple way of getting star office/open office onto pc's. Easy auto install, no buying or downloading media, no burning install cd's. Now you can do it with a click of a mouse and twenty minutes connectivity.

So what would this mean for student computing (see my previous post for background)

Basically the way I see it is as follows:

Students (almost) universally have a computing device
Google are giving away Star Office, and Aqua Open Office for the mac is not far away

=> there is no reason to provide a basic word processing/spreadsheet/presentations/web any more
=> google pack gives us a support free deployment for this software
=> google docs, zoho office fill the gaps
=> webdav provides maintainable filestore to allow students to save their work to university systems that are backed up
=> and the lms/vle and collaboration facility allows students to do most tutorial work

and this means

the computer labs increasingly become a specialist facilities environment
nasty packages can run on thin client / vm’s
ACE or whatever can give access to computing environments
all we do is mandate the formats submissions are made in, not the package
students largely are self supporting
students access most facilities via their own connection (as at a number of mainland European uni’s, eg the Sorbonne)

and these few students without access to a computer?

My pat answer is to set up an ANU computer recycling project (yes, I’ve done this before to solve a similar problem) and also incidentally enhance the institution’s green credentials, and reputation as a positive, caring institution etc

Possibly a tad radical, but I actually think it would work as a model and solve a number of our problems.

So GooglePack gets us Star Office into the environment. Also once we've got Star Office out there and built a Star Office culture we can perhaps look to a linux based desktop environment and thus reduce licensing costs further. After all if your word processor, browser, mail client are the same do you care?

I think not. They don't even have to be exactly the same. The resurgence of Apple in the university computing environment suggests you only need to be complementary as long as it works, is reliable and predictable.

Wednesday, 15 August 2007

internet killing culture ? [followup]

just by coincidence, Graeme Philipson had a slightly different take on this in yesterday's Sydney Morning Herald (or the Age if you're in Vic)

Tuesday, 14 August 2007

does the internet kill culture?

There's recently been a thread on the Guardian's blog pages about whether the internet is killing culture - all based around the theory that all these bloggers writing crap and wanna be thrash metal stars on MySpace are clogging up the cultural bandwidth.

All the cultural establishment are getting all huffy that people aren't using the orthodox record labels and book publishers and doing it by themselves.


People have always done this and always produced crap. Look at the number of dire magazines available, derivative mass market paperback novels and the like. There has always been a lot of rubbish out there, just as in the punk era when technology allowed bands to produce their own music on vinyl and cd there was a lot of not very good stuff in circulation. But the not very good stuff never really made it.

Like blogging. I blog to write down my ideas and practice writing skills. I won't pretend it's deathless prose and never expect to get famous from it. One day I'd like to do some serious writing. Just now I'm blogging with a purpose.

And if other people think what I write is sensible I might get an audience.

Now we know that sites are full of teenage rants and right wing hooey. Well someone has to start somewhere and everyone is entitled to their views (I'm with Voltaire on that one).

I don't think a lot of _literary_ writing and cultural commentary is a lot better than some of the blogs I've read, the only claim to authority in the more conventional publications is often not much more than it's written by someone who claims some cultural authority because they went to Oxford, slept with some famous artists and once had a small volume of poetry published, and have brown nosed the right people.

Well I went to St Andrews, once peed beside John Maynard Smith, and have been among other things chair of the UCISA systems forum. I think I have as much right as anyone else to have my views heard.

Oh, but I didn't brown nose the right people ...

No more student computer labs ?

Student computer labs fascinate me. Actually they don't, but a large part of my professional life has been tied up in their provision and the facilities provided.

However I'm now beginning to wonder if the way we think about student labs is outdated.

When student labs first came about it was in the days of timesharing systems and all one needed to provide was several rooms full of Wyse WY-85's, VT220's or whatever as all the software was on the timesharing system.

Then came the rise of the pc and the network. Originally universities provided labs full of pc's as they were expensive and crucially the software was expensive. Not to mention the fact that computer lab provision was one of the metrics used (in the UK at least) to assess teaching quality.

Over the years people have tried thin client in various forms, but it's never really taken off, in part because Citrix licensing makes the cost of a large scale deployment prohibitive, and desktop pc's were cheap.

The world has now changed, due to the rise of the laptop and the wireless network, not to mention cheap broadband. Suddenly the idea of a thin client/web 2.0 environment seems attractive, especially as students universally have access to a computing device of their own and some sort of network connectivity, but due to the need to work increasing numbers of hours to fund their studies, can't drop in to use a general purpose student lab.

What they need is access to some basic tools, and a compute/execution environment for the specialist and expensive software they need to use. As students almost universally have access to a computing device and network connectivity we don't even need to provide the hardware to run the thin client on - a software client such as citrix's ica will do fine.

And with the rise of virtualisation and technologies such as the vmware player we can potentially give students pre-rolled environments to work with.

Possibly high end cheap printing is also a requirement, but we already know how to do that.

So suddenly we're talking about providing services not facilities. Of course there will also need to be small labs of high end specialist hardware, but really for the bread and butter stuff we're talking about providing access, and actually suddenly our lives become much easier - need a new app?, roll it on a vm. Apps don't play nice together? no worries separate them out to separate thin client sessions.

And of course suddenly we don't need to worry about having kit stolen from open access labs, hardware refresh and maintenance and the like ...

Friday, 10 August 2007

docx and apple iWork

unless I've misread things, I get the distinct impression that iWork08, the new updated apple office suite doesn't handle docx either, leaving us with neo office or neo office until the aqua version of open office appears.

Given that the new version of Mac Office appears to be delayed to January this continues to make working in a multi platform environment just a tad challenging ...

Retro computing (ii)

Like I said, I found a cd drive for the old wall street g3 powerbook I'd acquired, which was pretty cool. Borrowed an old Jaguar (os x 10.2) install set and hey presto! forty minutes later I had an old, slow g3 running an old version of os x.

and there's the rub. jaguar really doesn't cut it. Most of the obvious common freeware doesn't work, eg no text wrangler, no neo office, no journler, not even aquaemacs, so I'm kind of stuck for a basic set of tools.

What did install was:

  • Camino 1.04 (means I can use google docs etc)
  • abiword 2.4 (loathsome, but hey it's a word processor)
  • nvu ( a bit touch and go but it runs)
  • mac emacs ( the old version, no sexy interface)
  • vim ( the old version for 6.2)
so I've got a writing/blogging machine out of it, which is a lot of what I do. I've got an old version of acrobat and ie, not to mention applemail, and I can always ssh from the terminal to another machine. What I am lacking is a spreadsheet (but google docs or zoho could perhaps provide that functionality) and a presentation tool (zoho probably).

So really what I've got is something that's kind of like a web 2.0 thin client. Put that way, it might be an interesting experiment using it ...

Wednesday, 1 August 2007

Mobile printing and a web enabled mobile phone ...

If we use the framework proposed in my previous post we can easily extend this to printing via mobile phones. Teppo Raisanen in his paper on a framework for mobile printing proposes a scenario whereby people use their mobile phones to print documents that they retrieve from their filestore.

Now, in a managed student printing environment we don't want students randomly walking up to printers and blatting jobs at printers' IR ports but we can do something clever that allows them to access their filestore and then select a file for processing.

Imagine the following:

We provide a webapp that allows the students to see all the files in their filestore. The ones that we can print are not greyed out. The easy ones are pdf and text as we know how to do that. However if we use something like open office as a backend processor we should be able to open Open Office in command line mode and batch convert the document to pdf and submit it through as before. Formatting may not be perfect in all cases but the student can create a ps spool file from their mobile phone which they can release to printing at a time and place of their choosing.
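
As a rough sketch of the greying-out logic, something like the following would do. The extension lists and function names are my own assumptions, and the `soffice --headless --convert-to pdf` command line is the form that current LibreOffice builds accept; whether a given Open Office install takes the same switches would need checking (older versions may need a macro instead).

```python
from pathlib import Path

# Formats we already know how to print, and ones we hope the office
# backend can convert. Both sets are illustrative guesses.
DIRECT = {".pdf", ".txt"}
CONVERTIBLE = {".odt", ".doc", ".rtf"}


def print_status(filename):
    """Classify a file as 'printable', 'convert' or 'greyed-out'."""
    suffix = Path(filename).suffix.lower()
    if suffix in DIRECT:
        return "printable"
    if suffix in CONVERTIBLE:
        return "convert"
    return "greyed-out"


def conversion_command(path, outdir):
    """Build (but do not run) a batch conversion to pdf."""
    return ["soffice", "--headless", "--convert-to", "pdf",
            "--outdir", str(outdir), str(path)]
```

The webapp would show "printable" and "convert" files as selectable and grey out the rest; the command list would be handed to something like `subprocess.run` on the server.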

And that's probably a good enough approximation to Teppo Raisanen's retrieve anywhere / print anywhere scenario for most practical purposes.

[Addendum - while researching this I came across who offer a web based on demand conversion service - cool!]

mobile printing ...

Printing, the joy of any IT manager's life. Except this time it's mobile printing.

Well of course most people don't actually know what they mean by mobile printing. It's got the word mobile in it so it must be really trendy and important, and because it's trendy we need to provide the service.

Actually in a university context, it's quite clear what people need.

Students need to be able to (a) print to any student printer on campus, not just the one in the lab where they're situated, and (b) print from their laptops to student printers.

Option (a) is really easy. We did this at York years ago. We create a virtual print queue which puts the spooled jobs for a particular student into a directory named after that student's user id. Jobs are held in the spool directory for some arbitrary time, say seven days. The student goes to a print location and logs into a print station (really an old pc running linux) which automatically fires up an application that lists the print jobs in their directory and gives them some simple choices (print colour, print b+w, double sided, and delete). Students can print selectively when they want to, where they want to and pick up their output. No wrangles about missing pages, no forgotten printouts littering the place.
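
The held-queue idea is simple enough to sketch in a few lines. This is not the York code, just an illustration assuming one directory per user id under a spool root and using file modification time as the job's age; all the function names are made up.

```python
import time
from pathlib import Path

HOLD_SECONDS = 7 * 24 * 60 * 60  # "say seven days"


def spool_dir(root, user_id):
    """Per-student spool directory, named after the user id."""
    return Path(root) / user_id


def held_jobs(root, user_id, now=None):
    """List the jobs a print station would offer this student."""
    now = time.time() if now is None else now
    directory = spool_dir(root, user_id)
    if not directory.is_dir():
        return []
    return [p.name for p in sorted(directory.iterdir())
            if p.is_file() and now - p.stat().st_mtime <= HOLD_SECONDS]


def expire_jobs(root, user_id, now=None):
    """Delete jobs older than the hold period; return the names removed."""
    now = time.time() if now is None else now
    directory = spool_dir(root, user_id)
    removed = []
    if directory.is_dir():
        for p in sorted(directory.iterdir()):
            if p.is_file() and now - p.stat().st_mtime > HOLD_SECONDS:
                p.unlink()
                removed.append(p.name)
    return removed
```

The print station application is then just a menu over `held_jobs` for whoever has logged in, with `expire_jobs` run from cron.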

It's good, it worked.

Option (b) is slightly more tricky, but if you have (a) in place not terribly so. Pharos have a commercial solution, but it's for XP and Vista only and involves installing drivers on students' laptops. It's also quite flexible and involves having the spooling operation take place on the laptop.

I have a better idea.

First of all we give students a print to pdf app. Mac and linux users can do this out of the box (ok Mac users can, linux users need to fiddle a little) but windows users need to create a virtual printer. Students then print their documents to pdf. Nice thing is that they have the opportunity to check the layout.

That's part one. Part two is to generate a web page with a file upload feature. Students connect to the web page, choose the pdf file they wish to upload and click ok. Behind the scenes we do some simple checks using something like jhove to make sure the file really is a pdf file (otherwise we dump it and tell them what we think it is) and then with a bit of ghostscript generate a postscript print job we dump in their spool directory to allow them to print next time they're on campus. As we only allow pdf and automatically convert everything to postscript they won't be tempted to use it for extra storage. But they can print their files next time they're on campus and they can make that decision when they're on campus rather than printing the job now and hoping the output will be there when they want to collect it.

The other joy is that they can always print the file at home, and that as the web upload is just that, an authenticated web upload, they can be anywhere on the planet, rather than connected to the campus network in one way or the other to do this.

Ok that's the idea. Now it needs to be refined, developed, finessed and something built...
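
A first cut of the behind-the-scenes step might look like this. It is only a sketch: the magic-number test is a cheap stand-in for a proper jhove validation, the function names and spool layout are assumptions, and the ghostscript command is built but not run (`ps2write` is ghostscript's PostScript output device).

```python
from pathlib import Path

PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data):
    """Cheap sanity check: a pdf file starts with the %PDF- header."""
    return data.startswith(PDF_MAGIC)


def spool_target(spool_root, user_id, filename):
    """Where the converted job goes; directory parts of the name are dropped."""
    return Path(spool_root) / user_id / (Path(filename).stem + ".ps")


def ghostscript_command(pdf_path, ps_path):
    """Build the pdf to postscript conversion command."""
    return ["gs", "-q", "-dNOPAUSE", "-dBATCH", "-dSAFER",
            "-sDEVICE=ps2write", "-sOutputFile=" + str(ps_path),
            str(pdf_path)]
```

The upload handler would reject anything failing `looks_like_pdf`, save the rest to a temporary file, and run the command with `subprocess.run(cmd, check=True)` so a failed conversion is reported rather than silently spooled.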

YouTube weirdness in Canberra ...

Canberra, like most of the south east of Australia, is either in drought or just coming out of drought. And by drought we don't mean a few months of a pissy hosepipe ban like you get in England, we mean 5 years of drought where pasture turns brown and dies, trees die and nothing grows.

This makes life hard for farmers. No crops, cattle and sheep feed is expensive, and a lot have gone to the wall.

Now Canberra, being the bush capital, still has working farms on areas that have not yet turned into suburbia. Some are real farms, and some are basically just horse paddocks where fat teenage girls care for fatter ponies. However on the rare day I drive into work I pass a working farm with cows and so on.

A few weeks ago a sign appeared 'please feed the cows bread' - not your usual 'please don't feed the animals'. Obviously the farmer was trying to keep his herd going even though he couldn't afford much in the way of feed.

Then a few days ago an extra piece of board arrived on the sign - just tacked on with a url

And yes there it is on youtube - the whole story.

Shows how community politics and campaigning are changing under the influence of YouTube etc

Tuesday, 31 July 2007

Where's the York Windows OS survey data gone ?

On my web site I've an informal bio that makes mention of the work I did putting together surveys of windows 95, 2000, NT adoption in UK universities.

These surveys were pretty well regarded in their time, and I know that various people in the UK HE computing community made use of them, not to mention Microsoft UK themselves for marketing purposes.

They are now clearly of historic interest only (unless, of course, you're going to offer me a job and want proof of organizational ability ;-) ).
The web pages were hosted at the University of York, my employers at the time.

Well I haven't worked for York for four years now, and no one else there, or elsewhere in the UK, has taken them over so not surprisingly York have finally sent them to /dev/null.

If you are looking for the survey information it seems to have been archived on the wayback machine. There are a number of copies, and as not much happened to them after 2003 any of the last few copies should be accurate.

Click here for the entry point to last copy archived.

Monday, 30 July 2007

Retro computing (again)

I've acquired an old G3 PowerBook - a Wall Street according to lowendmac.

Initially I was going to put linux on it, but then it's an oldworld Mac with the closed firmware and yellow dog needs bootx to start and it's all a bit of nightmare to install, so I thought I'd try Jaguar.

Problem - no CD drive. Not going to happen. So I'm forced to start thinking what can I actually do with an OS 9 Mac. Fortunately it has some basic internet tools and stuff so it can be used as a terminal, and it still has some of the apps installed on it.

Possibly, just possibly, I might still have my old claris word floppies from the legit copy I bought years ago, which would help turn it into a basic writing machine for offline blogging and the like, because whatever you say about it, it is a nice machine.

And that's it. Perfectly usable, useful if dated. Seems a shame to trash it. I'll keep you posted if I start using it for something purposeful

Wednesday, 18 July 2007

Adding a calendar application to fluxy ...

As a final tour de force I decided to add a calendar application to my virtual low memory/ubuntu/fluxbox/icewm machine. My first thought was orage but while it installed it didn't play nicely, so I went for something rather more heavyweight: korganizer. Coupled with my orage synchronisation script from January (plus a couple of edits to get rid of the xfce/orage specific references) it worked just fine.

This begs another question - if I can get kpilot to work can I get it to sync with my old palm pilot? - I foresee fun ahead when I build up the real box

Tuesday, 17 July 2007

Prototyping a lightweight linux box

At home, wrapped up in a plastic bag I have an old P2 400MHz machine with 64MB RAM that we used until recently as an alternate dial up machine. Has Windows 98, and somewhere I have a spare no name ethernet card that will work with it. I've also had a hankering for a linux machine at home, but even now linux is a little bloated and a lot of distros don't run comfortably in so little memory.

However in theory you could build ubuntu to run in that little memory with an alternate window manager. Well, I didn't build it on the old clunker, but using parallels I built a custom low memory vm using the alternate install cd for ubuntu 7.04 (feisty fawn) to build a command line only system (there's an ubuntu guide that walks you through this).

To this I added:
  • window managers
    • icewm - preferred
    • fluxbox
  • editors
    • gedit
    • Kwrite (my favorite) & Kate
  • applications
    • abiword
    • firefox
    • icepodder
  • doobries
    • xfe - file manager
    • dillo - lightweight browser

Notice - no mail client. Theory is that firefox, while a bit slow, will perform well enough to work with gmail and by keeping things light the system should perform well enough. If mail is really dire there's always mutt. As a test system it seems to hang together nicely. I can edit (this is being written with kwrite on the vm), print, wordprocess, and surf the web, not to mention download podcasts, which is all I want to build the system for.

In emulation it seems fine. Next question is how well does it perform on genuine 1999 hardware?

Monday, 25 June 2007

Old trains and digital archiving ...

I like railways. Or more accurately I like the social history of railways and the changes they brought, and they were, in their time, as much a world changing technology as the internet. So while I admit to taking pictures of trains when I was twelve, I was always more interested by the station buildings, the posters, the advertising and the changes in people's lives.

It made tourism possible, at least for the middle classes. It made travel possible. One of the more bizarre moments is that John McDouall Stuart, the man who endured terrible privations surveying the route of the transcontinental telegraph line from Adelaide to the north coast of Australia in the 1860's, announced his return from the unexplored outback by sending a telegram from the railhead at Burra and getting the morning train back to Adelaide.

So railways were a world changing phenomenon. And their relics are all around, but rapidly disappearing as the world increasingly forgets railways. Equally their social history is also understudied, perhaps because of the unfortunate association of an interest in railways with the sad men who stand at the ends of station platforms in England with notebooks and flasks of tea, collecting engine numbers.

Now I must admit that during the time I travelled extensively by rail for business in England, I didn't pay much attention to this interest of mine. Too close to work, too many other things to do. Since moving to Australia it's become a greater interest if only because one looks at abandoned train stations and realises whoever designed them copied designs already in use in England. In fact following up on this is the sort of project I could imagine myself spending some of my declining years engaged in, after all it encompasses my interests in history, archaeology, bushwalking and in playing with computers and digital cameras. Not to mention a professional interest in digital preservation and archiving.

And, I thought, there must be a wealth of material on the web, enough sad buggers who have assembled collections of source material and photographs.

There isn't. As a totally unscientific test I tried looking for pictures of Callander station on the web. Callander was a jumping off point for the Trossachs, a favourite Victorian tourist destination, and had a big white wooden station. I found exactly one picture.

This was puzzling to me at first. It was a popular destination, people must have taken pictures of it. I remember taking pictures of the derelict station sometime in the late sixties/early seventies when I was all arty and into photography the way teenage boys sometimes are. Of course I don't have these photographs now, or if I do they're unclassified and as good as lost, rather in the same way that Roman coins found by metal detectorists and stripped of their context have little historical value.

And then I realised why. The train line at Callander closed in 1965, meaning pictures of the working station must be forty years old. The station stood derelict for some time thereafter, and people other than me must have taken pictures of it, but they're probably in people's sheds and attics gently decaying, and the person who took them dead, or at least pretty old.

Now the site is a car park and there's no opportunity to reconstruct the original building.

And because no-one documented these things some of our history is being lost.

However not all is doom and gloom. In the course of checking this out I came across the website of the Great North of Scotland Railway Association, who are actively trying to archive (and by implication, catalogue) their members' holdings as a resource for future study.

Equally at the other end of the world, the State Library of Tasmania has an eHeritage initiative, working with local historical societies to digitally preserve historical records, documents and photographs to ensure that they don't get lost.

And that's the key. Digitisation without a preservation strategy is valueless. Preservation without archiving, ie adding context to the items preserved is valueless. Properly digitised and preserved they're a resource for future study.

They may seem mundane, but to a first century Roman clay lamps seemed mundane. Now their distribution tells us a lot about Roman trade routes. Similarly by preserving today's and yesterday's common place, it gives us a picture of how life was lived ...

Friday, 22 June 2007

DocX - the nightmare continues ...

Well whatever we feel about docx it isn't going away, especially now that Microsoft have End_of_Life'd Office 2003 in a move to boost the uptake of Office 2007, which means we need to be pragmatic and come up with a workable solution, which in the case of the Mac seems to be Neo office. Microsoft's own import filter for the Mac just barfs on my machine but Neo office imports neatly, graphics and all, and lets you export the document in various useful formats.

Making 2003 EoL is of course also going to be a nightmare for multi-platform sites as docX will start to spread through their windows fleet in a near viral manner causing mayhem to the non-Windows installed base. Sites with large numbers of windows machines will experience a similar problem due to the financial hit upgrading everyone at once will cause. Open Office as a corporate office suite? Reads all your legacy documents just fine. Only problem is that Open Office isn't really integrated into Aqua on the Mac, even though they're working on this.

So for now Neo Office is your friend if you have Macs on site. Does what Open Office does and handles docX to boot. One quirk though is Safari recognises that docx, like odf and like the good old open office format is a zip based format and helpfully unpacks the docx bundle for you. To work round this little problem I resorted to downloading the offending file using parallels and dragging the offending file from the windows desktop to the Mac desktop. Surely there's got to be something a tad more elegant ...
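
The zip business is easy to see for yourself. The sketch below peeks inside a file and guesses which family it belongs to, relying on the fact that Office 2007 packages carry a `[Content_Types].xml` entry while ODF and old Open Office files carry a `mimetype` entry; the function name and return strings are just for illustration.

```python
import io
import zipfile


def describe_office_zip(data):
    """Guess what kind of zip based office document a blob of bytes is."""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return "not a zip container"
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = set(archive.namelist())
    if "[Content_Types].xml" in names:
        return "Office 2007 package (docx/xlsx/pptx)"
    if "mimetype" in names:
        return "OpenDocument or old Open Office format"
    return "plain zip"
```

Which is all Safari is doing: it sees a zip and unpacks it, without checking whether the zip was meant to be kept whole as a document.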

But this isn't a complete solution to the problem of submission to scientific journals I blogged about earlier. At least it lets you edit .docx documents. Still it doesn't really handle the problem that basically the equation editor in Office 2007 doesn't use MathML or a compatible format.

And this is important as typesetting equations is hard, and computerised typesetters can have their own quirks. One of the reasons AmiPro was so popular with mathematical scientists when it appeared was that it had an equation editor that produced TeX code and yet was a proper onscreen word processor. Just as the only reason TeX has hung on is that typesetters understand it and what you put in is what you get out.

Now writing a program to parse markup isn't that hard (OK it is but it's doable), which means you can convert TeX, MathML or any markup based document to something a typesetting machine understands (SGML or whatever - one of the weirdest sights I ever saw was a commercial printer that had a floor full of people in cubes editing raw SGML in vi on Macs to feed it into a typesetter and fix any conversion problems).

The other key point is that if the document is in a known, or well understood format you can always convert it to something else. docX isn't, the specification is owned by Microsoft and they can tweak it to fix problems which means that you get creep, which is a document conversion engineer's nightmare. ODF and the other open formats have specification documents which you can refer to. Adobe have published detailed specification documents for pdf to allow you to write your own pdf export utilities meaning the format is open in the sense that the knowledge on how to parse it is publicly available.

All good. As the world's gone digital, lots of documents, research findings, whatever only exist in electronic form. Yes, people may have printed copies scattered round their offices, in the same way they used to have offprints, but there are no catalogued, findable, non-digital copies.

If the document is in a format that can't be read it might as well be dead, or written in linear B, maya, tokharian, or something equally obscure. If the format's known we can always access the knowledge. And fundamentally that's why docX is a problem. It doesn't follow open, described standards so there's no guarantee of future access, or that when in 2012 we open with word 2011 a docX document created with Word 2007 we'll see exactly the same document.

[Addendum: In this discussion I'm ignoring the very similar problems caused by Excel's new xlsx format in Office 2007, but that doesn't mean they're not out there]

Monday, 18 June 2007

keyboards and dishwashers ...

I eat lunch over my keyboard (one of my less appealing habits). Every so often I have to invert it and shake the crap out and I've already written off one keyboard with my unsavory habits.

So I've always been interested in ways of keeping keyboards clean especially ones in public access labs that get kind of yucky after a year's greasy fingered students have done their worst.

Most cleaning solutions include expensive products plus the employment of cleaning staff to come round and do the cleaning. So given that most of the waste in a keyboard is skin, grease and food waste I've always wondered if you could run a keyboard through a dishwasher and then dry it off with some water displacement chemical, eg WD40. Now someone at NPR's tried exactly that. And it does seem to work. Even if it probably invalidates the warranty and risks doing damage to the keyboard. But there seems to be all sorts of FUD about doing it. But then when a basic USB keyboard costs ten to fifteen bucks, what's the risk?

If your keyboard works afterwards, you've saved $10. If it doesn't you're no worse off than you were ...

Sunday, 10 June 2007

[canberra] Magpies after nest material already ...

I have no idea if it's global warming or not, but this morning, when I went out to pick up the paper from the drive there were a bunch of magpies banging about in a eucalypt looking for all the world that they were looking for nesting material.

Given we've had two days of high winds they could be wanting to repair nests, but it's not yet mid winter and they're turning to nest building in Canberra.

Just to add to the conundrum the azaleas and even the jonquils (miniature daffodils) are flowering in our garden ....

Wednesday, 6 June 2007

Desmond Morris, a liking for red, and blondes ...

Now we could get all religious and start talking about Adam and Eve and red apples, or indeed do the Desmond Morris 'Naked Ape' thing about red lips and red labia, but it still remains that the evolution of human colouration is an interesting topic.

And blondeness is another aspect of human colouration that looks like an odd mutation. I first blogged about this over a year ago on my other blog: here's a slightly updated version:

Blondes ...
posted Tue, 28 Feb 2006 15:56:02 -0800

I've often wondered about the evolution of blonde hair. It really would have to have had an evolutionary advantage to become common, and anyway, why is it found only (near enough) in populations whose ancestors lived in northwest Europe?

Now blonde is a funny set of mutations, pale skin, blue eyes, yellow hair - definitely a bit of a freaky mutation.

Other populations who live in northwest Europe have the pale skin mutation - your classic Celtic beauty with milk white skin is probably the result of a selection for a population that makes the most of the available sunlight to make vitamin D. Very sensible in a population where it's cloudy and rains a lot, and consequently not a lot of sunlight.

I'm going to guess that you don't see such a mutation in northern Japan because they get more UV, even if it rains a lot.

But why blonde - a freaky looking mutation that makes them look like a different species?

Peter Frost, a Canadian anthropologist, has been wondering about the same thing and has come up with the theory that blonde evolved to make striking-looking people who were more sexually desirable to our neolithic ancestors, and so blondes, as well as having more fun, got more food and ended up being more successful at reproducing themselves.

Studies of north west european populations show that there's a welter of variants of hair colour and the mutations of the three hair colour genes date back to the end of the ice age, say 11,000 years ago, when the population was small and the mutation could spread quickly among particular groups.

There's an updated article in the London Times on his research.

More generally Peter Frost seems to have been doing a lot of work in this area and has written a book on the evolution of lighter skin colour in humans, otherwise known as 'why aren't we all brownish coloured?'

How good the research is I don't know but it appears to have been accepted for publication in a respectable journal or two so it seems plausible. Even if it's wrong it's an interesting idea.

postscript - Aboriginal children in Australia often have blonde hair. Nineteenth century romantics used to say that this was from shipwrecked Dutch sailors in the sixteenth century, never mind that the Aborigines don't have any stories about this while they've lots of stories about other things. What evolutionary advantage would blonde hair give them?

Primate colour vision: was fruit or sex the evolutionary driver?

A long time ago, I used to do research in animal behaviour. And one of the questions that stayed with me was why did primates evolve colour vision. 3D is easy, if you climb about in trees a lot having 3D helps you avoid killing yourself, but colour? Yes you need to know when fruit are ripe but an augmented monochrome vision would do that as long as there was contrast with the surrounding vegetation.

And then there's birds. A long time ago I got interested in this topic, but really through looking at the evolution of berry colour and foraging strategy in birds.

Basically my question then was 'why are so many berries red?', which, given that a lot of birds perceive different colours to us, is slightly weird. And of course 'why are those which are not red blue-black?' Which is kind of interesting as while red is easy for us ape derived beings to see, blue-black is markedly less so (and why do some go through a red phase of ripening?)

Certainly birds rapidly learn to associate all sorts of colours with either acceptable or distasteful food objects so there's some flexibility there, it's not as if the red preference is hard wired. I've done a little proof of concept experiment on this and Ian Soane did a much better one on 'Why are distasteful prey not cryptic?'.

Fruit bats are also kind of interesting. Like primates they have binocular colour vision and eat fruit and live in a tropical area with no real seasonality, meaning that brightly coloured (ripe) fruit are something they should be looking for if they want to live well.

So, my question is, which came first? Fruit colour to denote ripeness or colour vision to detect ripeness? And if the former, why is our colour vision so good at detecting variations when really all that's needed is a detection system that says 'fit to eat (or not)'?

And some people at Ohio University may have the answer, or part of it at least. (There's also a fuller report here.)

A preference for red made us evolve a better response for red coloration. Which would also mean that red coloured fruits would stand out more and be selected. (Which is good if you're a fruit, as you get eaten and your seed shat out somewhere else with a nice pat of fertiliser to start some more trees).

But did the red preference come first as a response to the availability of red fruit because birds (and fruit bats) had a preference for them (our augmented redness detector does the enhanced monochrome thing)? And why the hell do we have difficulty with blue-black? Was red just simply good enough?

Monday, 4 June 2007

What do people really really want?

If you run a SoE (standard operating environment) you spend a certain amount of time debating what people need in the way of a base application install.

Usually it comes down to:

  • a text processing application (that interworks with Word [.doc])
  • a spreadsheet that can handle Excel files
  • a presentation viewer that can display powerpoints
  • a web browser that supports javascript
  • various viewers and plugins including something to deal with pdf files
  • a mail client (with or without calendaring functions - people swing both ways on that)

and that's about it. You can build a mac based environment. You can build a linux based environment. You could build a thin client environment that does it. You could do it through X-Windows, basically there's more than one way to skin this cat.

But from bitter experience I can tell you that you end up using Microsoft products.


Not because they're cheaper, or necessarily better, though some are pretty slick, but because Microsoft have convinced the 97% of the populace that don't do this for a job that they put the Word in Wordprocessing and that you can't possibly use anything else.

Now along comes a geek with an old 1980's Mac and a super duper up to the minute dual core AMD box who does a comparison between various applications on the Mac and his hot box. Now it's really a bit of fun and not that valid a comparison, but it does beg the question 'how much functionality do you really need?'

And with my retro/recycled computing hat on I've got to say I concur - not that much and not that much need for expensive bloatware. The problem is however user education. Because Excel is effectively a synonym for spreadsheet, users are locked into the mindset of having to use Excel. And in a sense who can blame them - the shelves at Borders groan with 'How to do it with Excel' books but bugger all in the way of 'Open Office Calc for fun and profit'.

And that's it. People don't care what they have as long as they can get their job done. People know they can get the job done with Excel, and know there's a whole backup in terms of training, self help books, web forums and the rest. It's the comfort thing. And a support website run by geeks who know nothing and care less about multi currency cost accounting doesn't cut it - people want the reassurance factor, they don't want to be heroes, especially where their job's concerned ...

DocX is a nightmare ...

DocX is a nightmare. Work in a site where most people have Office 2003 or else Office 2004 for the Mac and we have a scenario where people can't handle documents from Office 2007 users, which is a bit bad when people collaborating on documents expect to share and revise them.

Now there are converters for the Mac and the PC, and people working with Office 2007 can save documents in the 'old' .doc format. A pain but workable you might think.

But then comes the news that Office 2007's .doc export mode isn't really .doc, well certainly as far as mathematical equations are concerned and as a consequence some journal publishers are refusing Office 2007 files.

Now given that people live and die by journal submissions and citations this is a fairly major problem. Suddenly TeX starts looking attractive, or perhaps some other common standard such as ODF. (Mind you I'll bet some publishers can't handle that either). This also has implications for the long term storage/archiving of documents in a revisable format - having your equations as a bitmap isn't the best ....

Wednesday, 16 May 2007

What pictures of naked people tell us about tagging

Flickr is a wonderful tool for studying tagging/folksonomies - as pictures are, well, pictures, it's only by looking at tags that we can find content. Now one of the more common (85,000 plus entries) tags is 'naked'. Search for it and you get a whole range of images, from amateur cheesecake shots of teenage girls with no clothes to pictures of mole rats, by way of a whole range of pictures including an overweight young woman using flickr in the nude - incidentally confirming my prejudice that America is a deeply weird place.

Anyway the four most common sorts of pictures are:

  • pictures of young women not wearing any clothes

  • pictures of young children playing on the beach/in the yard

  • participants in the world naked bike ride

  • participants in a Japanese religious festival

Clearly naked meant not wearing clothing to everyone who posted these pictures and tagged them that way. That's what we would expect where the meaning of the word is well known in English.

However the tacit metadata (or the associations) of the tag for the various groups were different, ie the way people thought about naked was different. To some it was associated with childhood and innocence. To others it had a sexual dimension, others solidarity and shared action.

And that's the point. In a controlled vocabulary we might distinguish between naked, nude and unclothed depending on context. In a folksonomy that's not the case unless we have a degree of common understanding where to use which synonym to more closely convey meaning.

A folksonomy implies a degree of commonality. For a small group of people working on a shared purpose that's probably fair. For a large random group of people that's not the case, even in such an apparently simple case as pictures of people who aren't wearing clothing.

Where there is no commonality there is no tacit controlled vocabulary and hence you get different classes of images tagged the same, meaning that the tags lose their value as a discriminant.
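One rough way to put a number on 'losing value as a discriminant' is to look at how mixed the pictures under a single tag are. The sketch below uses Shannon entropy for that, which is my choice of measure and not something Flickr provides. The function name and both sample lists are made up for illustration.

```python
from collections import Counter
from math import log2


def tag_entropy(labels):
    """Shannon entropy (in bits) of the kinds of picture found under one tag.

    0 bits means every picture is the same kind; higher means the tag
    tells you less about what you will actually get.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return sum((c / total) * log2(total / c) for c in counts.values())


# Hypothetical samples: what a human reviewer decided each tagged picture shows.
small_shared_group = ["bike ride"] * 8
large_random_crowd = ["portrait", "children", "bike ride", "festival"] * 2

print(tag_entropy(small_shared_group))  # one kind of picture only
print(tag_entropy(large_random_crowd))  # four kinds, evenly mixed
```

With one kind of picture the tag pins things down completely. With four evenly mixed kinds you need two further bits of information before you know what you are looking at.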

Singapore airlines provides star office on flights

Check this out - Singapore Airlines now provides Star Office as part of their inflight entertainment system.

Makes sense - a lot of airline entertainment systems run on linux (certainly Malaysian's does - there's something about seeing it reboot when you're 30,000 ft over Bali) so Star Office would be the easiest to provide in a networked environment.

Tuesday, 15 May 2007

Flickr, tags, folksonomies and the logic of crowds ...

Some time ago I went to a presentation on library technology in which all sorts of people started getting all enthusiastic about tagging and how you could get students to rate courses, content, modules, background reading etc etc.

Won't work. Just won't. People are first of all selfish, and there's enough people to skew the results of a small group. The whole wisdom of crowds thing depends on having a large enough group to have a marked central tendency so that the mavericks and the oddities cancel out. That's the theory behind opinion polls. Unfortunately you probably won't get a thousand people tagging eighteenth century novels in English 101 and a damn sight fewer tagging middle english love poems. And anyway why should they - what's in it for them?
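The arithmetic behind that is easy to sketch. The numbers below are invented (a 1 to 5 rating scale, three mavericks who all vote 1, everyone else voting 4) and the function name is mine, so treat it as a back of the envelope illustration and not a model of any real rating system.

```python
def skewed_mean(n_honest, honest_score, n_mavericks, maverick_score):
    """Average rating when a few mavericks vote alongside the honest raters."""
    total = n_honest * honest_score + n_mavericks * maverick_score
    return total / (n_honest + n_mavericks)


# Assumed scenario: the same three mavericks turn up in groups of different sizes.
for group_size in (10, 100, 1000):
    honest = group_size - 3
    print(group_size, round(skewed_mean(honest, 4.0, 3, 1.0), 3))
```

In a group of ten the three mavericks drag a 4.0 down to 3.1. In a group of a thousand the same three barely move it (3.991), which is the 'cancel out' effect that small tagging groups never get.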

For once this isn't cynicism on my part. I came across three articles that read together basically tell the same story, and they're based on empirical research rather than prejudice:

So all these ideas of creating a folksonomy just don't work. Anyway those who control access to knowledge may have an opinion about this - tacit metadata as per a post of mine a couple of years ago:

Tacit metadata
posted Mon, 30 May 2005 10:49:39 -0700

Went to an interesting seminar on metadata at the ANU on Friday by Matthew Allen from Curtin.

His basic thesis is that most metadata contains an implicit categorisation model and that the model is quite rigid. Most formal metadata models are highly prescriptive with the use of controlled vocabularies etc implying a particular view of how data is organised and categorised.

Formal metadata models are supposed to make explicit what is implicit, but actually it is more complex than that.

His point was that we all know that Journal X is more prestigious than Journal Y, or that such and such a university has a better reputation than another in a particular field. Access to this knowledge is controlled by leading practitioners who impart knowledge over the years by an initiation ritual.

At this point I was struck by the immediate resemblance to indigenous knowledge systems – for initiation rituals read graduate scholarship and for leading researchers read senior old men – ie there are people who have position because they are thought to hold knowledge of value and control access.

(As an aside in the Arts and Humanities this is based on perception and not by some quasi objective ranking – eg science citation rankings – as in Sciences. Which leads to the question of when does research stop being an interesting synthesis of ideas in a discursive conversation – otherwise known as plausible bullshit – and become part of human knowledge. I've often wondered this about the arts)

To return to the seminar.

Digitisation has created a vast demand for metadata categorisation, such that we could imagine that the digitisation process could never be completed. Equally this 'objective' categorisation would eventually overwhelm researchers as any online search would produce a vast number of results.

We need some way to interpret the results. To an extent we rely on tacit metadata for an implicit ranking of the value of each result.

One approach to side step this may be to use a folksonomy style approach where practitioners label results – this would use an implicit controlled vocabulary and would build a collection of resources within a particular field of knowledge – the more ranked it is by scholars in the field the more accurate the description would be and the greater the index of value of the resource – allowing the tacit to be made explicit.

Interestingly the NLA/Arrow project are encouraging people to add their own folksonomy type terms to any documents lodged.

Also struck by the possible relevance to indigenous knowledge projects and the means of soliciting knowledge by allowing people within the community to annotate objects – and the annotations then contain metadata and knowledge.

So tags and folksonomies use the logic of crowds to create an implicit controlled vocabulary to describe the object, and if the same people tag many objects we end up with a set of common words and terms. Trouble is, and I'm repeating myself here, you need a critical mass, elsewise you end up with shit as one of the key descriptors - which may be critically appropriate but doesn't help decide on the relevance of the material ...

Thursday, 10 May 2007

Citrix, SGD and compression technologies

Had a lightbulb moment this morning - went to a presentation from Riverbed on their WAN optimisation kit.

Now a problem we have been grappling with is this:

Students do not necessarily use university provided computers these days, by preference they use their own to access university facilities. This is in part because most of them have part time jobs and can't just drop into a computer lab for a couple of hours any more.

So the solution is to provide them access to a standard computing environment.

One way is to provide a classic thin client environment using Citrix or Sun Global Desktop (SGD), which means we provide the apps, the disk and the execution space. SGD and Citrix place a low demand on the client, run on anything (more or less) and make little or no assumptions about the line speed, and because they're lightweight as far as the end machine is concerned, they don't impact heavily on its performance.

Using VMPlayer and a pre rolled virtual desktop makes the execution happen locally and comes with a whole bag of assumptions about the architecture of the machine and the amount of grunt available. Coupled to this we can't predict the line speed so accessing remote documents might just suck. Even using compression technology like the riverbed desktop client (or the bluecoat one) tends to make assumptions about the amount of grunt and resources available locally.

In a corporate environment where you control the environment you can make the environment predictable by ensuring everyone has adequate hardware.

Not in a university. Students have everything from super sexy MacBooks through boring but adequate Dell or Acer laptops (or indeed any of the nameless rebadged brands from office superstores - Medion anyone?) through to tatty old desktops running linux. All in all an environment that is only predictable in its unpredictability.

Under these circumstances the only strategy to go for is the most lightweight, lowest impact, most multi platform solution, and that looks like Citrix (or SGD).

shredding != security

If you haven't already seen it, check out this article in the Guardian about an interesting solution to put shredded Stasi files back together electronically.

(Also I can heartily recommend 'The Lives of Others' as a movie, especially if you like dark deep movies)

securely wiping disks

Recently there was some conversation on one of the lists I subscribe to about the best way to wipe disks. The biggest gripe was the amount of time it took. It's not really about time, it's about data security. Here's my two cents on the subject:

Wiping disks takes time. Disks can also contain potentially valuable information. Deciding how to wipe and what to wipe is a value judgement.

For most purposes something like DBAN will give you a wipe to a standard that will satisfy most auditors (it conforms to standards, standards are good, auditors have to cover their backsides too), and it has the added security of making sure that that credit card number in a cached file really has gone. Important, as you never know where your disks end up. One time in Morocco I saw a whole pile of second user disks (some still with vendor stickers on them suggesting they came from a large facility manager) on a market stall.
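To make the time cost concrete, here's a minimal sketch of what an overwriting wipe does, applied to a single file and not to a whole disk. It is not how DBAN works internally and is no substitute for it: the function name, the pass counts and the final zero pass are all my own assumptions, and overwriting a file in place gives no guarantee on SSDs or on journalling or copy-on-write filesystems.

```python
import os


def overwrite_file(path, random_passes=2, chunk_size=1024 * 1024):
    """Overwrite a file in place: some passes of random bytes, then one of zeros.

    Illustrative only. Every pass rewrites every byte, which is why wiping
    a large disk takes as long as it does.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for pass_number in range(random_passes + 1):
            final_pass = pass_number == random_passes
            handle.seek(0)
            remaining = size
            while remaining > 0:
                length = min(chunk_size, remaining)
                # Random data on the early passes, zeros on the last one.
                block = bytes(length) if final_pass else os.urandom(length)
                handle.write(block)
                remaining -= length
            # Push this pass out to the device before starting the next.
            handle.flush()
            os.fsync(handle.fileno())
    return size
```

Three passes over a file means writing three times its size, and the same sums apply to a whole disk, so the number of passes you ask for is really a statement about how much risk you're prepared to carry.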

Occasionally, you (or your masters) want to be really certain the data is gone. I once worked on a project where we engaged a company to dispose of our hardware securely. This involved breaking down machines, zeroing any static ram and having the disks cut in half by a very large man with an even larger angle grinder. You then accompanied said man to a very hot furnace where you watched him put the bits of disk in the furnace and shut the door. That _was_ data disposal.

Wiping disks is about managing risk, not time

Friday, 4 May 2007

OLPC, Gmail, and the communication thing

Kids who get OLPCs get gmail accounts.

Kenya and Rwanda have bought Google Apps for their universities.

Put it together and suddenly you have a whole lot of email access allowing people to ask questions.

Of course a lot of it will be flim flam but suddenly people in the bush have communication with people in the cities. Families are reconnected via webmail. School teachers can ask questions when they don't know things. Suddenly this communication thing starts happening, breaking down the isolation of the rural poor.

And that can only be a good thing.

Dr Nick's stalking horse

Last night I was surfing ebay for no good reason and happened across a gorgeous yellow, and I mean yellow, 1970s typewriter made in Holland. And only $35. I lusted after it, because I'm (a) sad and (b) like retro things. However cute as it was I didn't bid for it, in the sure knowledge that my wife would kill me if we had to make space for such a retro device. A trained artist, she firmly believes things like that belong in design museums, not lounge rooms.

And in a funny kind of way the gorgeous yellow typewriter is linked to Dr Nick's One Laptop per Child programme. A lot of people in developed countries want one because it's cute, looks good and is a talking point.

But it's also a stalking horse. People might start using these cutesy linux powered low cost boxes seriously as 'take anywhere' machines which, with links to things such as Google Docs, provide lightweight cheap portable computing, just as various boxes like the Compaq Aero did 10 years ago.

Couple that with some thin client stuff and you're away.

And suddenly these cutesy boxes aren't so cutesy anymore, and repackaged in grey and silver start looking like business machines.

And that's what Microsoft is worried about with its $3 give away to education users in poor countries. An old operating system on second user computers in the third world isn't a corporate threat. A cheap low cost linux based platform in the first world is ....

Thursday, 3 May 2007

last week

windswept tree i
Originally uploaded by moncur_d.

Took last Friday off, went bushwalking up Guthega Trig, felt really relaxed and had all these interesting blog entries planned.

Then on Sunday got a call that both the web caches were dead, and it's been kinda downhill from then: fixes, reports, meetings, other fixes, so much so that all these really cool ideas have just gone out of my head.

Tuesday, 24 April 2007

remembrance of things past ...

We've had a burst of UK nostalgia about this being the 25th anniversary of the release of the Sinclair Spectrum.

And certainly it launched a generation of machine code programmers and hackers. I'll even admit to having a Spectrum Plus (not to mention a Jupiter Ace), even though I was paid to work with real computers.

But it's funny. Looking at the Spectrum reminded me of the QL, another Sinclair computer, and one far more interesting for its implications, because Strathclyde University launched a plan to give/lend/lease QLs to students in 1986/87, something that was pretty revolutionary at the time given that most university networks consisted of classic time sharing solutions and a few computers on slow serial lines.

Strathclyde's adoption of the QL, while doomed because they chose the wrong platform, was revolutionary in its realisation that computers were an adjunct to learning and not just a glorified calculator/typewriter replacement...

also posted on my journalspace blog

Tuesday, 3 April 2007

Linux and the Library of Congress ...

Interesting article in LinuxWorld on the US Library of Congress's use of open source tools for digitizing rare books/materials and also for making them accessible over the web.

Worth checking out if you're interested in this kind of thing ...

P2P Lockss and private solutions

In February I wrote about using P2P and Lockss to make a basic archiving solution.

Interestingly the Lockss people picked up on this post and sent me a nice email with some information on people who were also working on a similar solution based around a private lockss archive model.

However, I've got to have a mea culpa here - their email went off to my Yahoo account where there was a filter that was supposed to forward on email to a couple of different accounts depending on what it was - except it didn't work and the mail just sat in the inbox. As I don't actually check my yahoo inbox very often it sat there for an embarrassingly long time. Basically shows you can be too much of a geek sometimes ....

Easy neuf ...

Came across this little curiosity in today's Herald Tribune. A major French ISP has started renting low cost information access devices running linux. I wonder how long it'll be before they do a google apps bundle?

Certainly it's an interesting application of the 'nearly a thin client' model to sell broadband: the inbuilt software is there only to make the box seem usable as a home computer. Of course you could build your own using damn small linux and boot it from a usb drive or whatever.

(If you can read French, googling for easy neuf brings up a range of other links.)

Friday, 30 March 2007

wizzy digital courier

Of course, someone's already had the fidonet idea with wizzy digital courier, a uucp data transfer solution ...


I've come across this sort of thing before but the BBC has an interesting article about 'bus-net' whereby people access the web via an offline reader and content, plus outgoing emails etc, are transferred via a cache on the bus. It's essentially a way of getting round one of the major aspects of communications in the third world - the old style fixed line telephone network isn't pervasive, while a mobile phone infrastructure is simpler to build - masts, base stations and microwave links back to wherever fibre stretches to. (On reflection one of the problems with the three regions theory of phone use is just that - access is by mobile phone in much of the third world because that's all there is, and the use of mobile services in preference to internet is an artefact of this. Provide additional infrastructure (mesh networks for example) and the dynamics change ...)

One possibility/enhancement would be to use something like the old Fidonet architecture with low cost slow dialup links to get back to the mother host and grab required web pages from the master cache with wget and do a simple mail exchange. Not totally copper free but possibly slightly more dynamic than a pure bus-net solution.
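As a rough sketch of the store and forward idea, and nothing to do with how Fidonet or wizzy actually implement it: page requests pile up in a queue while the link is down, and everything is fetched in one batch when the dialup window opens. The class and function names here are invented, urllib stands in for wget, and the mail exchange half is left out.

```python
from urllib.request import urlopen


def fetch_over_link(url):
    """Fetch one page over the live link (standing in for wget)."""
    with urlopen(url, timeout=30) as response:
        return response.read()


class OfflineQueue:
    """Hypothetical local cache that only talks to the outside world in batches."""

    def __init__(self):
        self.wanted = []  # URLs people asked for while the link was down
        self.cache = {}   # url -> page bytes, readable offline

    def request(self, url):
        """Note that someone wants this page; no network traffic happens here."""
        if url not in self.cache and url not in self.wanted:
            self.wanted.append(url)

    def sync(self, fetch=fetch_over_link):
        """Run once per dialup window: fetch what was queued, keep failures queued."""
        still_wanted = []
        for url in self.wanted:
            try:
                self.cache[url] = fetch(url)
            except OSError:  # covers dropped lines, timeouts and URLError
                still_wanted.append(url)
        self.wanted = still_wanted
        return len(self.cache)
```

Anything that fails because the line drops simply stays in the queue for the next call, which is the property that makes a slow, unreliable link tolerable.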

What is also interesting in the BBC article is the reference to lack of content in local languages and how that's holding up adoption