Won't work. Just won't. People are first of all selfish, and there are enough of them to skew the results of a small group. The whole wisdom of crowds thing depends on having a large enough group to have a marked central tendency so that the mavericks and the oddities cancel out. That's the theory behind opinion polls. Unfortunately you probably won't get a thousand people tagging eighteenth century novels in English 101 and a damn sight fewer tagging Middle English love poems. And anyway why should they - what's in it for them?
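To put a rough number on the opinion poll point, here is a back-of-envelope sketch. It assumes the textbook 95% margin of error for a proportion from a simple random sample, which is my simplification: real taggers are not a random sample, so treat it as a best case. The function name and the group sizes are just illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated from n people.

    Assumes a simple random sample; p=0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Compare a seminar-sized group with a poll-sized one
for n in (20, 100, 1000):
    print(f"{n:>5} taggers: +/- {margin_of_error(n):.1%}")
```

With a thousand people the uncertainty is around three percentage points; with twenty it is over twenty points, which leaves plenty of room for a handful of oddities to set the result.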
For once this isn't cynicism on my part. I came across three articles that, read together, basically tell the same story, and they're based on empirical research rather than prejudice:
- So what can we learn from Flickr
- The Del.icio.us Lesson
- When tags work and when they don't: Amazon and Library Thing
So all these ideas of creating a folksonomy just don't work. Anyway those who control access to knowledge may have an opinion about this - tacit metadata, as per a post of mine a couple of years ago:
posted Mon, 30 May 2005 10:49:39 -0700
Went to an interesting seminar on metadata at the ANU on Friday by Matthew Allen from Curtin.
His basic thesis is that most metadata contains an implicit categorisation model and that the model is quite rigid. Most formal metadata models are highly prescriptive, with the use of controlled vocabularies etc implying a particular view of how data is organised and categorised.
Formal metadata models are supposed to make explicit what is implicit, but actually it is more complex than that.
His point was that we all know that Journal X is more prestigious than Journal Y, or that such and such a university has a better reputation than another in a particular field. Access to this knowledge is controlled by leading practitioners who impart knowledge over the years by an initiation ritual.
At this point I was struck by the immediate resemblance to indigenous knowledge systems – for initiation rituals read graduate scholarship and for leading researchers read senior old men – ie there are people who have position because they are thought to hold knowledge of value and control access.
(As an aside, in the Arts and Humanities this is based on perception and not on some quasi-objective ranking – eg science citation rankings – as in the Sciences. Which leads to the question of when does research stop being an interesting synthesis of ideas in a discursive conversation – otherwise known as plausible bullshit – and become part of human knowledge. I've often wondered this about the arts.)
To return to the seminar.
Digitisation has created a vast demand for metadata categorisation, such that we could imagine that the digitisation process could never be completed. Equally this 'objective' categorisation would eventually overwhelm researchers as any online search would produce a vast number of results.
We need some way to interpret the results. To an extent we rely on tacit metadata for an implicit ranking of the value of each result.
One approach to sidestep this may be to use a folksonomy style approach where practitioners label results – this would use an implicit controlled vocabulary and would build a collection of resources within a particular field of knowledge – the more it is ranked by scholars in the field the more accurate the description would be and the greater the index of value of the resource – allowing the tacit to be made explicit.
Interestingly the NLA/Arrow project are encouraging people to add their own folksonomy type terms to any documents lodged.
Also struck by the possible relevance to indigenous knowledge projects and the means of soliciting knowledge by allowing people within the community to annotate objects – and the annotations then contain metadata and knowledge.
So tags and folksonomies use the logic of crowds to create an implicit controlled vocabulary to describe the object, and if the same people tag many objects we end up with a set of common words and terms. Trouble is, and I'm repeating myself here, you need a critical mass, elsewise you end up with shit as one of the key descriptors - which may be critically appropriate but doesn't help decide on the relevance of the material ...
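The critical mass problem is easy to see in a toy example. This is only a sketch of one way to pull an implicit vocabulary out of free tags, not how Flickr, del.icio.us or LibraryThing actually do it; the function name, the threshold and the sample taggers are all made up for illustration.

```python
from collections import Counter

def consensus_tags(taggings, min_taggers=3):
    """Return tags used by at least `min_taggers` distinct people.

    `taggings` maps a person to the list of tags they gave one object.
    Results are ordered by popularity, then alphabetically.
    """
    counts = Counter()
    for tags in taggings.values():
        # Normalise case and whitespace; count each person once per tag
        counts.update({t.strip().lower() for t in tags})
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, n in ranked if n >= min_taggers]

# Hypothetical tags for one eighteenth century novel
taggings = {
    "alice": ["Gothic", "epistolary", "novel"],
    "bob": ["gothic", "novel", "boring"],
    "carol": ["novel", "gothic "],
    "dave": ["shit"],
}

print(consensus_tags(taggings, min_taggers=3))               # the crowd's vocabulary
print(consensus_tags({"dave": ["shit"]}, min_taggers=1))     # a crowd of one
```

With enough taggers a threshold filters out the one-off opinions and leaves the shared terms. With a single tagger there is nothing to filter against, and whatever that person typed becomes the key descriptor.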