Friday, 6 August 2010

Identifying who might have published what

One of the more interesting problems in trying to create a research community and identify linkages between researchers is working out retrospectively who actually published what.

For example, if you google for "moncur d" you get a lot of results in the first two or three pages that are to do with me, and also this one:

Heather, P. – Moncur, D. (2001), Politics, Philosophy and Empire in the Fourth Century. Select Orations of Themistius (Translated Texts for Historians), Liverpool

Given (a) my interest in classical history, (b) the fact that I used to work in the UK and (c) the fact that my name turns up in the Acknowledgement sections of a couple of publications, you might suspect that I had a secret double life and had a hand in the above translation.

I didn't.

It's not me. I'll put my hand up for my very sparse list of publications but no more. People who work in computing in universities inevitably get involved in projects with data management issues, and most people are kind enough to acknowledge the assistance rendered, as in this marine biology example. However, as regards translations of Themistius, I have absolutely zero connection to the above or any related publication.

And of course sometimes people do have past lives - as shown by the acknowledgements in this abstract. Yes folks, I really was once a research physiologist!

However this does highlight an issue in trying to build retrospective social graphs based on people's research and publication record - how can you be sure that the person is who you think they are?

You can't.

There will inevitably be mistakes and inconsistencies, especially when working out linkages between past collaborators. The only real approach is to build your maps, and then run consistency checkers, check for outliers and then investigate the outliers.

Sometimes they'll be genuine one-off collaborations, sometimes they'll be erroneous.
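As a rough illustration of what such a consistency check might look like, here is a minimal Python sketch. It assumes the publication record has already been reduced to a list of author-name lists, and it flags any co-authorship link whose two ends share no other collaborator. The function names and the sample names are invented for the example, and "no shared collaborator" is only one possible definition of an outlier.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthor_graph(publications):
    """Map each author name to the set of names they have published with."""
    graph = defaultdict(set)
    for authors in publications:
        for a, b in combinations(set(authors), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def suspicious_links(graph):
    """Return co-author links whose two ends share no other collaborator."""
    flagged = set()
    for a, neighbours in graph.items():
        for b in neighbours:
            # An empty intersection means this link is a single point of connection.
            if not (graph[a] & graph[b]):
                flagged.add(tuple(sorted((a, b))))
    return sorted(flagged)


# Invented sample data: each inner list is the author list of one publication.
publications = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author B", "Author C"],
    ["Author A", "Author X"],
]

graph = build_coauthor_graph(publications)
for link in suspicious_links(graph):
    print("investigate:", link)
```

A flagged link is only a prompt to go and look. The check cannot tell a genuine one-off collaboration from a mistaken identity.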

As an approach this also solves what I term the 'Asian Researcher Problem'.

Let us say we have a bright post-graduate whose name is Wong Chung Ki. He realises his western friends have problems with his name, so he starts calling himself Charles Wong and using the initials CK Wong. After a few years he moves to a Chinese-speaking university in south east Asia and reverts to publishing under his original name.

Here consistency checking the social graphs should help us because there should be a high degree of overlap in collaborators and we should be able to work out they're the same person, in just the same way that outliers and single points of connection are suspicious due to a low degree of overlap ...
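One way to put a number on "a high degree of overlap" is the Jaccard similarity of two names' collaborator sets: the size of the intersection divided by the size of the union. The sketch below is an assumption about how this could be done rather than a tested method. The threshold of 0.5, the function names and the collaborator names are all made up for illustration, and names that have actually co-authored a paper are skipped on the grounds that they must be different people.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def same_person_candidates(collaborators, threshold=0.5):
    """Return (name, name, score) for name pairs with heavily overlapping collaborators."""
    candidates = []
    for x, y in combinations(sorted(collaborators), 2):
        if y in collaborators[x] or x in collaborators[y]:
            continue  # people who publish together are not the same person
        score = jaccard(collaborators[x], collaborators[y])
        if score >= threshold:
            candidates.append((x, y, score))
    return candidates


# Invented sample data: name -> set of collaborator names.
collaborators = {
    "CK Wong": {"A Lee", "B Tan", "C Ng"},
    "Wong Chung Ki": {"A Lee", "B Tan", "D Lim"},
    "E Smith": {"F Jones"},
}

for x, y, score in same_person_candidates(collaborators):
    print(f"{x} / {y}: overlap {score:.2f}")
```

With this sample data the only pair reported is CK Wong and Wong Chung Ki, who share two of four distinct collaborators. In practice the threshold would need tuning, and two genuinely different people in the same small group would also score highly, so the output is a list of candidates to check by hand.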

1 comment:

tenthmedieval said...

Ironically, these are very much the problems I have distinguishing people in charter evidence in my medieval research... Is that Miró the same as the previous one? They've got land in the same place; but this one's a judge, and that one never uses such a title even when you'd expect it. But people don't, always... but Miró's a really common name... and so on.

The practical effect of all this, as one of my Ph.D. markers said at my viva, is that we wind up drawing most of our conclusions from a set of people with quite silly names.