And finally,
for fun I fed the Gutenberg Collected works of Chaucer into the wordcloud software ...
this is actually quite interesting.
I didn't have a Middle English stopwords file, so of course we see that common forms of address (ye, thou, thee, thy etc.) predominate. So I made myself a very simple supplementary stopwords file for the wordcloud, consisting of the obvious bits of Middle English (thee, thy, thou, ye, eke, gan), and then reran the generation process:
which I think we can agree is a bit better, though it needs more work: for example, quoth, hath, anon and may should probably be excluded as well.
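For anyone wanting to try the same idea without the wordcloud software, here is a minimal sketch of the stopword filtering in Python. It is not how the wordcloud tool works internally, just a plain word-frequency count under my own assumptions: the function name `top_words` and the sample sentence are made up for illustration, and only the stopwords themselves come from the lists above.

```python
import re
from collections import Counter

# Supplementary Middle English stopwords, as listed in the post.
MIDDLE_ENGLISH_STOPWORDS = {"thee", "thy", "thou", "ye", "eke", "gan"}

# The same list plus the further candidates mentioned in the post.
EXTENDED_STOPWORDS = MIDDLE_ENGLISH_STOPWORDS | {"quoth", "hath", "anon", "may"}


def top_words(text, stopwords, n=10):
    """Return the n most frequent words in text that are not stopwords.

    Hypothetical helper for illustration: lowercases the text, splits it
    into runs of ASCII letters, and counts whatever is left after filtering.
    """
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in stopwords).most_common(n)


# Made-up sample sentence, standing in for the full Gutenberg text.
sample = "Thou hast thy love, quoth he, and love hath thee anon"

print(top_words(sample, MIDDLE_ENGLISH_STOPWORDS, n=3))
print(top_words(sample, EXTENDED_STOPWORDS, n=3))
```

In practice you would read the full text from a file and combine these sets with a modern English stopwords list, since words like "and" and "he" still get through here.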
Using an extended stopwords list, one can come up with something like this:
which is possibly a more accurate picture of what drives Chaucer's writing. I must say that I'm quietly impressed with how well this displays the themes in a body of text ...