A suite for creating & evaluating phrasal embeddings via the ECO
model based on Efficient, Compositional, Order-Sensitive n-gram Embeddings (EACL 2017).
The Skip-Embeddings and English Wikipedia used to generate the skip-embeddings can be downloaded here.
evaluations
: data and scripts for different evaluation tasks to evaluate the embeddings.skipEmbeds
: the script used to generate theECO Skip-Embeddings
and vanillaword2vec
embeddings. ⋅⋅1. We extended Debora Sujono's python version of word2vec. ⋅⋅2. We also have a local C version that is not tested. ⋅⋅3. The embeddings used in the paper and released were created using the python version.
If you use our changes to the code or our skip-embeddings, please cite us:
@inproceedings{Poliak:2017EACL,
Title = {Efficient, Compositional, Order-sensitive n-gram Embeddings},
Author = {Poliak, Adam and Rastogi, Pushpendre and Martin, M. Patrick and Van Durme, Benjamin},
booktitle = {Proceedings of the 15th Conference of the European Chapter of the
Association for Computational Linguistics},
Year = {2017},
Publisher = {Association for Computational Linguistics},
location = {Valencia, Spain}
}
There is a typo in equations (6) and (7) in the EACL proceedings. The version found at https://www.cs.jhu.edu/~apoliak1/papers/ECO--EACL-2017.pdf has the correct equations.