I've always had an affection for Sir Gawayne and the Green Knight (the first bit of real Middle English I read), so I thought that, as a final act of fiddling about with wordcloud, I'd feed the Gutenberg version into the IBM wordcloud software just to see what came out,
which neatly demonstrates the need for a proper Middle English stopwords file. Hacking my original file to produce an extended, though very incomplete, file, one gets something a little better:
which shows that one of the things we need to take this outside of playing with nineteenth and twentieth century English text is a set of agreed stopword files for analyses.
This would clearly also apply to analyses with other languages, be it Malay or Old Irish...
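
For anyone wanting to try the same idea outside the wordcloud software, here is a rough Python sketch of what a stopword file does to the word counts. The stopword set below is my own illustrative guess at a few common Middle English function words, not an agreed list and not the file I actually used, and the function name `word_counts` is made up for the example.

```python
import re
from collections import Counter

# Illustrative and deliberately incomplete: an assumed handful of common
# Middle English function words, not an agreed stopword list.
MIDDLE_ENGLISH_STOPWORDS = {
    "þe", "þat", "and", "of", "in", "watz", "hit",
    "he", "his", "to", "with", "bi", "as", "þer",
}


def word_counts(text, stopwords):
    """Count words in text, ignoring case and skipping any stopwords."""
    # [^\W\d_]+ matches runs of letters, including thorn (þ) and yogh (ȝ).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)


sample = "Siþen þe sege and þe assaut watz sesed at Troye"
counts = word_counts(sample, MIDDLE_ENGLISH_STOPWORDS)
print(counts.most_common())
```

Swapping in a different set (Malay, Old Irish, whatever) is just a matter of passing another stopword collection, which is really the point: the code is trivial, the agreed lists are what's missing.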