Word2Vec for Comparative Semantic Spaces

I’ve recently become interested in Word2Vec as a way to represent semantic relationships between words in a corpus. In particular, I’m interested in making comparisons between corpora: how do different texts organize concepts differently? Here I attempt to sketch a theoretical basis for Word2Vec, drawing on early structural linguistics and sociology. Then I examine some basic results from training a Word2Vec model on the Gutenberg texts included with the NLTK Python library. Might this approach have utility for understanding how authors organize different concepts in a text?

