A package for generating and exploring old-school word embeddings.
The package is under development and is not yet suitable for serious work.
- Free software: MIT license
- Documentation (link is broken): https://svd-algebra.readthedocs.io.
- A model trained on the English Wikipedia dump, along with the accompanying vocabulary, can be downloaded here: https://drive.google.com/open?id=1C1o53_6S4bS-Lw3wBBvaP9011tajZrq1
Coming soon!
- I'd like to learn Cython and linear algebra; that's the main reason.
- For most NLP tasks, a PMI matrix with some SVD is enough. See Chris Moody's "Stop Using word2vec" post: https://multithreaded.stitchfix.com/blog/2017/10/18/stop-using-word2vec/
- A well-parametrised old-school embedding is as good as a neural one, according to this post: https://rare-technologies.com/making-sense-of-word2vec/
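
To make the "PMI matrix with some SVD" idea concrete, here is a minimal sketch in plain NumPy. It is not this package's API: the function name `ppmi_svd_embeddings` and its parameters `window` and `dim` are hypothetical, chosen only for illustration. It counts co-occurrences in a symmetric window, converts them to positive PMI, and keeps the top singular directions as word vectors.

```python
import numpy as np


def ppmi_svd_embeddings(sentences, window=2, dim=2):
    """Hypothetical helper: PPMI + truncated SVD embeddings from tokenised sentences."""
    # Vocabulary in sorted order so row indices are deterministic.
    vocab = sorted({word for sentence in sentences for word in sentence})
    index = {word: i for i, word in enumerate(vocab)}

    # Symmetric co-occurrence counts within `window` tokens on each side.
    counts = np.zeros((len(vocab), len(vocab)))
    for sentence in sentences:
        for i, word in enumerate(sentence):
            start, stop = max(0, i - window), min(len(sentence), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    counts[index[word], index[sentence[j]]] += 1.0

    # PMI(w, c) = log( p(w, c) / (p(w) p(c)) ), computed only for observed pairs.
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    observed = counts > 0
    pmi = np.zeros_like(counts)
    pmi[observed] = np.log(counts[observed] * total / (row @ col)[observed])

    # Positive PMI: clip negative associations to zero.
    ppmi = np.maximum(pmi, 0.0)

    # Truncated SVD; scaling by sqrt of the singular values is one common choice
    # (an assumption here, not something this README prescribes).
    u, s, _ = np.linalg.svd(ppmi, full_matrices=False)
    dim = min(dim, len(vocab))
    vectors = u[:, :dim] * np.sqrt(s[:dim])
    return vocab, vectors
```

Calling `ppmi_svd_embeddings([["the", "cat", "sat"], ["the", "dog", "sat"]], window=1)` returns the sorted vocabulary and one row vector per word; words that share the same contexts (here `cat` and `dog`) end up with matching vectors. A real corpus would need sparse matrices and a truncated solver rather than a dense `np.linalg.svd`.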
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.