Word2Vec models pre-trained on large corpora are invaluable because they pack a great deal of semantic and contextual information into a lookup table of only a few million words. At around 300 dimensions they tend to perform well on synonym and analogy tests, and they can be applied to a number of machine learning applications. The challenge with these large models is that they take a long time to load into memory when your program starts, and the lookups are computationally intensive enough that you may not want to run them on your desktop computer. I’ve written a Python library called word2vecserver that lets you load a pre-trained model onto a server and use the client library to request vector representations or analogy tests from another computer.
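To make the server/client split concrete, here is a minimal sketch of the general pattern using only the Python standard library. It is not word2vecserver's actual API: the `/vector` route, the `start_server` and `fetch_vector` helpers, and the three-dimensional toy vectors are all assumptions made up for illustration. The point is only that the vectors are loaded once in a long-running process, and other programs ask for them over the network.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, quote, urlparse

# Toy stand-in for a loaded model (hypothetical values). A real server
# would hold the pre-trained vectors here, loaded once at startup.
TOY_VECTORS = {
    "king": [0.5, 0.7, 0.1],
    "queen": [0.5, 0.2, 0.8],
}


class VectorHandler(BaseHTTPRequestHandler):
    """Answers GET /vector?word=<word> with JSON (a made-up route)."""

    def do_GET(self):
        parsed = urlparse(self.path)
        word = parse_qs(parsed.query).get("word", [""])[0]
        if parsed.path == "/vector" and word in TOY_VECTORS:
            status, payload = 200, {"word": word, "vector": TOY_VECTORS[word]}
        else:
            status, payload = 404, {"error": "unknown word or route"}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def start_server(host="127.0.0.1", port=0):
    """Start the server in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), VectorHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_vector(host, port, word):
    """Client side: request one word's vector; raise KeyError if unknown."""
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/vector?word=" + quote(word))
        response = conn.getresponse()
        payload = json.loads(response.read().decode("utf-8"))
    finally:
        conn.close()
    if response.status != 200:
        raise KeyError(word)
    return payload["vector"]


if __name__ == "__main__":
    server = start_server()
    host, port = server.server_address
    print(fetch_vector(host, port, "king"))
    server.shutdown()
    server.server_close()
```

In this sketch the client pays only for a small HTTP round trip per lookup, while the slow model load happens once on the server side.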
To use the library, download the pre-trained Google News file and load it into memory using Gensim.
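As a rough sketch of the loading step, the snippet below uses Gensim's `KeyedVectors.load_word2vec_format`. The file name `GoogleNews-vectors-negative300.bin` is an assumption about what the downloaded file is called on your machine, and `load_google_news` is just a hypothetical helper name, so adjust both to your setup. The optional `limit` argument reads only the first N vectors, which can help when memory is tight.

```python
import os

# Assumed name of the downloaded file; change it to match your download.
DEFAULT_PATH = "GoogleNews-vectors-negative300.bin"


def load_google_news(path=DEFAULT_PATH, limit=None):
    """Load pre-trained vectors in word2vec binary format with Gensim.

    Hypothetical helper for illustration. `limit` caps how many vectors
    are read from the start of the file (None reads them all).
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Pre-trained vectors not found at {path!r}")
    # Imported lazily so the path check above works even without Gensim.
    from gensim.models import KeyedVectors

    return KeyedVectors.load_word2vec_format(path, binary=True, limit=limit)


if __name__ == "__main__":
    if os.path.isfile(DEFAULT_PATH):
        # Loading the full file is slow; a limit keeps experiments quick.
        model = load_google_news(limit=200_000)
        print(model["king"].shape)
        # Analogy test: king - man + woman
        print(model.most_similar(positive=["woman", "king"],
                                 negative=["man"], topn=1))
    else:
        print(f"Download the vectors and save them as {DEFAULT_PATH}")
```

Expect the full load to take a while and to use several gigabytes of memory, which is exactly the cost you want to pay once on the server rather than in every client program.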
I’ll add updates as I begin to use it in different contexts. Feel free to modify the code as needed, and if you contribute useful commits I’ll accept them!