Monday, 13 August 2012

W G Burn Murdoch meets topic modelling ...


Quite some time ago I blogged about W G Burn Murdoch's From Edinburgh to Burmah, chronicling a trip he made in 1908.

It is an enjoyable bit of Edwardian travel writing, but what struck me most was the writer's developing sense of Scottishness and also his sympathy with the Burmese people over the annexation of Upper Burma.

At the time I suggested that one could trace a Scottishness meme through his friendship with William Speirs Bruce and the Scottish Antarctic Expedition.

So today I decided to test this out.

First of all I downloaded and installed the GUI version of the MALLET topic modelling software and fed the texts of both his books through it. Some beautiful fridge poetry resulted, but not much of a hint of Scottishness.

Edinburgh to Burmah topics:

1. water sand great left grey soft burmah till evening top
2. home sea made good night indian things natives pretty dinner
3. light hot country burmese brown told thought royal figures faces
4. white black time morning dark board long put head steps
5. red trees back yellow small colours feel train flowers notes
6. people men man air women fish music chinese young coloured
7. day colour sun house full prince deck golden hair low
8. blue half gold sky miles big shore pass hand days
9. side feet work project high ladies native gutenberg pleasant make
10. river round green india place night line open east ground


Edinburgh to Antarctica topics:

1. great night boat seals feet doctor till found called hours
2. crown white man illustrations south put weather warm ship op
3. water snow grey left seal line hard option balaena thought
4. antarctic vo edinburgh round vols back crew svo fcp red
5. wind air work boats world mate sea top brought cabin
6. long men small blue islands sky birds penguins rev turned
7. sea made black life cr home half dark colour green
8. ice day days light whale north amp bergs whales sun
9. board ship good works lay heard cold pack making end
10. time land deck make head side skins mist hands sir


The second list is a little odd because the copy of Burn Murdoch's Antarctica book I used is a non-proofread OCR version, which contains what are obviously font or word recognition errors.

So, to sanity check what I was seeing, I installed the IBM word cloud software and fed the books through that. Again, not a hint of Scottishness stood out.

Edinburgh to Antarctica wordcloud


Edinburgh to Burmah wordcloud
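For anyone who wants to repeat the sanity check without a word cloud tool, the underlying idea is just a word-frequency count with common words removed. The following is a minimal sketch of that idea, not what the IBM software actually does: the stopword list, the minimum token length, the list of "Scottish" word stems and the sample sentence are all my own assumptions for illustration, and the sentence is made up rather than quoted from either book.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real tools use far longer ones).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "was",
             "we", "i", "on", "as", "at", "for", "with", "that", "his", "he"}

# Hypothetical stems that might signal a Scottish theme.
SCOTTISH_STEMS = ("scot", "highland", "tartan")


def word_counts(text, min_len=3):
    """Lowercase the text, keep alphabetic tokens, drop stopwords and short tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if len(t) >= min_len and t not in STOPWORDS)


def theme_share(counts, stems):
    """Fraction of counted tokens that start with any of the given stems."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    hits = sum(n for word, n in counts.items() if word.startswith(tuple(stems)))
    return hits / total


# Made-up sample text, standing in for the full text of a book.
sample = ("The Scots piper played on deck, and the Burmese crowd "
          "on the river bank cheered the piper.")
counts = word_counts(sample)

print(counts.most_common(5))
print(theme_share(counts, SCOTTISH_STEMS))
```

To try it on a real book, replace `sample` with the contents of the text file. The minimum length filter also drops some of the very short tokens that turn up in the OCR text, such as "vo" and "cr", though it will not catch longer ones.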

Now I'm not about to rubbish topic modelling as a technique, but it is possibly not a complete substitute for critical reading. In his 1908 Burmah book I certainly got a sense of W G's developing sense of Scottishness as opposed to Britishness, and that this informed his feelings about Upper Burmah. It doesn't show up in these analyses.

And that, I think, is important. Applied to newspaper reports or scientific publications, topic modelling quite clearly can pull out important themes. What it doesn't pull out is the subjective and impressionistic ...


