Stuff, geeky stuff: reputation in a digital world

Just watched a stimulating presentation by Alan Rusbridger of the Guardian on the future of media, and newspapers in particular (the lecture in full is available on the ABC 702 site and there's a nice summary of what he said about Twitter on The Age).

Now what he's saying actually has resonances with what is happening to scholarly communication.

In the age of social media the gates to publication are no longer controlled by a group of older men who are journal publishers and who invite their friends to carry out peer review. That's not to say it didn't work quite well for the past 150 years, but we all know there have been instances of nepotism and worse.

However, that's not the case anymore. Those days are gone. More so in journalism than academia, but they're gone. The genie is out of the bottle.

Anyone can publish anything, and can publish the data and the analysis that they used to back it up.

And if it's interesting it will be picked up and will spread virally. Retweeted, linked to, cited and the rest. Without wanting to seem to be waving parts of my anatomy, a couple of posts on this blog have been picked up by a journalist working for the Wall Street Journal, and I've had follow-up questions and comments from a range of reasonably reputable people.

Now if you get an email out of the blue asking for your opinion of X on the basis of something you wrote you do a few checks - such as googling the person concerned to see that they are who they say they are - and if they check out you probably put some effort into a more detailed reply than you otherwise might.

In other words you are assigning an implicit reputational score.

The same with Twitter. You (usually) only follow people you find interesting (and/or witty). You assume people follow you for the same reason. Twitter's 'who to follow' suggestions work the same way, by suggesting people followed both by those you follow and by people who follow you. At best it can be frighteningly good at picking out the Twitter personas of people you know professionally. In my experience, while it might suggest people that you don't want to follow for a variety of reasons, it very rarely comes up with oddball suggestions.
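To make that concrete, here is a minimal sketch of the suggestion rule as I've described it. It is only my guess at the idea, not Twitter's actual algorithm, and the function name `suggest_follows`, the `follows` mapping and the account names are all made up for illustration.

```python
def suggest_follows(user, follows):
    """Suggest accounts followed both by people `user` follows
    and by people who follow `user`.

    `follows` is a hypothetical mapping: account -> set of accounts it follows.
    """
    following = follows.get(user, set())
    followers = {acct for acct, targets in follows.items() if user in targets}

    # Everyone followed by at least one account that `user` follows.
    via_following = set()
    for acct in following:
        via_following |= follows.get(acct, set())

    # Everyone followed by at least one of `user`'s followers.
    via_followers = set()
    for acct in followers:
        via_followers |= follows.get(acct, set())

    # Keep accounts reachable both ways, minus ones already followed and `user`.
    return (via_following & via_followers) - following - {user}


# Invented example data.
follows = {
    "me": {"ann", "ben"},
    "ann": {"me", "cat", "dan"},
    "ben": {"cat"},
    "eve": {"me", "cat", "fay"},
    "cat": set(),
}
suggestions = suggest_follows("me", follows)
```

A real system would presumably rank the candidates (say, by how many of your contacts follow them) rather than just return a set, but the overlap is the essential bit.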

And what of course is happening is that you are building a web of trust. Not perfect but no worse than with people you meet at a conference.

So in essence you vouch for people and people vouch for you. A tacit version of eBay's scoring system for sellers and buyers. If 10 people say Fred is a good person to deal with he probably is. You don't know these people but you go on what they say because enough of them say the same thing.

Another such example is Boden, a UK clothes website. Selling clothes, primarily to women, online is difficult as users want to know how it fits and hangs on them, not how it hangs on a skinny model at least ten years younger than the average age of the people using the site. Boden encourages users to post anonymous reviews of products. No one reading the review knows the age or shape of the person writing the review, but if there are a number of reviews all saying that the material is too shiny or the cut too tight, it's likely that there is a problem. Again the reviews are anonymous, but you go on what they say because enough of them say the same thing - essentially the same thing that goes on in opinion polling. Not totally accurate but close enough.

Translating this to scholarly publication, assume that the people who follow or regularly read particular, more academic, blogs are a self-selecting population of interested individuals. If they then cite posts, either as links in articles they themselves write or as retweets, it suggests that the post has some worth, just as in the old days one would track citations in the Science Citation Index to decide if a particular paper was worth following up on.

And by examining the cross links, the social graph, one can define the inner community and consequently identify the loonies. Basically they may cite you but no one in the group cites them.
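As a rough sketch of that last test, assuming you already have a list of who-links-to-whom and a first guess at the inner community (the function name, the names and the links below are all invented for illustration):

```python
from collections import defaultdict


def one_way_citers(links, community):
    """Return people outside `community` who cite its members
    but are never cited by anyone in it.

    `links` is a list of (source, target) pairs; `community` is a set of names.
    """
    cites = defaultdict(set)
    for source, target in links:
        cites[source].add(target)

    # Everyone cited by at least one community member.
    cited_by_community = set()
    for member in community:
        cited_by_community |= cites.get(member, set())

    outsiders = set()
    for source, targets in cites.items():
        if source in community:
            continue
        if targets & community and source not in cited_by_community:
            outsiders.add(source)
    return outsiders


# Invented example: dave cites the group but nobody in the group cites dave;
# erin is outside the group too, but carol links back to her.
links = [
    ("alice", "bob"), ("bob", "alice"), ("bob", "carol"), ("carol", "alice"),
    ("dave", "alice"), ("dave", "bob"),
    ("erin", "carol"), ("carol", "erin"),
]
community = {"alice", "bob", "carol"}
flagged = one_way_citers(links, community)
```

Being flagged this way only says that the links run one way. It might be a crank, or it might be a newcomer nobody has noticed yet, so it's a prompt for a closer look rather than a verdict.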

So link analysis allows you to assign weight to posts by people you may not know. And this probably benefits less established scholars: if they say interesting things, it is likely to be picked up on and a set of bidirectional links established. This isn't particularly new - web metrics companies such as Alexa have been using the number of inbound links as a reputational index for a number of years, and Google, as we all know, uses links in its PageRank algorithm as part of its ranking of a site's likely relevance.

And of course it is possible to establish these reputational scores algorithmically. And we also lose the distorting effects of work that is published in a journal with a generally higher reputational score being ranked higher than work of equal significance being published in a less well regarded journal.
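For the curious, here are both ideas in miniature: a plain inbound link count, and a simplified textbook-style PageRank by repeated redistribution of scores. This is a sketch only. It is not how Alexa or Google actually compute anything, and the function names, the damping value of 0.85 and the example links are my assumptions.

```python
def inbound_counts(links):
    """Count inbound links per node from a list of (source, target) pairs."""
    counts = {}
    for source, target in links:
        counts.setdefault(source, 0)  # make sure pure citers still appear
        counts[target] = counts.get(target, 0) + 1
    return counts


def simple_pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: each node repeatedly shares its score
    equally among the nodes it links to."""
    nodes = sorted({name for link in links for name in link})
    outgoing = {name: [] for name in nodes}
    for source, target in links:
        outgoing[source].append(target)

    rank = {name: 1.0 / len(nodes) for name in nodes}
    for _ in range(iterations):
        new_rank = {name: (1 - damping) / len(nodes) for name in nodes}
        for name in nodes:
            # A node with no outgoing links spreads its score over everyone.
            targets = outgoing[name] or nodes
            share = damping * rank[name] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Invented example: dave links to others but nobody links to dave.
links = [
    ("alice", "bob"), ("bob", "alice"), ("bob", "carol"),
    ("carol", "alice"), ("dave", "alice"), ("dave", "bob"),
]
counts = inbound_counts(links)
ranks = simple_pagerank(links)
lowest = min(ranks, key=ranks.get)
```

The difference between the two is that the raw count treats every inbound link as equal, whereas the PageRank-style score gives more weight to a link from someone who is themselves well linked to.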

For example, if the editor of Nature were to ask me to rework this as an article for publication, I'd probably get a note of congratulation from one of the great and the good of the institution I work for. I doubt, however, that I would get one if the article ended up in the Australian Journal of Research in Information Science. Of course on the whole Nature chooses wisely and chooses articles of significance.

However, if it is the case that I write something that is published in an obscure journal, it is likely the article will be missed and treated as being of little significance, purely due to the journal of publication. Certainly bibliometric systems such as Socrates tend to weight publications on the basis of the journal of publication rather than just an assessment of worth.

And this is equally important with dataset citation. There are no journals. And if the dataset is a result of experimental or observational work the likelihood of its reuse will depend on the reputation of the research group that produced it. The same is true of literary corpora. We trust the data more if we trust the people who produced it.

Reputational scoring based on the social graph of the author rather than raw citation rates increases the chance of innovative work being picked up on. And that is surely a good thing.

Tuesday, 23 November 2010
