Some pointers for the paper: Evaluating Word Embedding Models: Methods and Experimental Results
Since we have sometimes been asked for the information needed to reproduce the experiments, we provide a list of pointers to the resources used in this paper.
The experiments were conducted with different programming languages in different environments, so the setups may take some time to work out. The links are provided below.
= = = = = =
The following is a comprehensive list of links for most of the programs we used:
- Training Data: http://nlp.stanford.edu/data/WestburyLab.wikicorp.201004.txt.bz2
- word2vec: https://code.google.com/archive/p/word2vec/ (a minimal training sketch follows this list)
- GloVe: https://nlp.stanford.edu/projects/glove/
- FastText: https://github.com/facebookresearch/fastText
- ngram2vec: https://github.com/zhezhaoa/ngram2vec
- dict2vec: https://github.com/tca19/dict2vec
- Word Similarity Evaluation: https://github.com/BinWang28/PVN-Post-Processing-of-word-representation-via-variance-normalization and https://github.com/BinWang28/EvalRank-Embedding-Evaluation (a sketch of the evaluation procedure follows this list)
- QVEC evaluation: https://github.com/ytsvetko/qvec
- For Neural Machine Translation, we use OpenNMT-py (https://github.com/OpenNMT/OpenNMT-py). We use perplexity evaluation only, since fully training an NMT model takes too much time, and the Europarl v8 dataset (French-English) for training (a sketch of the perplexity computation follows this list).
- For experiments on POS tagging, chunking, and NER: https://github.com/billy322/RepEval-2016
= = = = = =
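
As a quick illustration of how embeddings are trained on the corpus above, here is a minimal sketch using gensim's Word2Vec. Note that gensim is our stand-in for illustration only; the paper used the original C implementation linked above, and the hyperparameters shown are common defaults, not the paper's exact settings. It assumes the .bz2 corpus has been decompressed to wikicorp.201004.txt.

```python
# Minimal sketch: train word2vec-style embeddings on the Wikipedia corpus.
# Assumptions: gensim >= 4.0 is installed, and the .bz2 corpus has been
# decompressed to wikicorp.201004.txt (plain text, one document per line).
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

corpus = LineSentence("wikicorp.201004.txt")  # streams the file line by line

model = Word2Vec(
    sentences=corpus,
    vector_size=300,  # embedding dimension; a common choice, not the paper's setting
    window=5,         # context window size
    min_count=5,      # drop words rarer than this
    sg=1,             # 1 = skip-gram, 0 = CBOW
    workers=4,
)

# Save in word2vec text format so evaluation scripts can read it.
model.wv.save_word2vec_format("vectors.txt")
```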
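For the word similarity evaluation, the repositories above contain the full scripts; the sketch below only shows the underlying procedure, i.e., ranking word pairs by cosine similarity and correlating with human judgments via Spearman's rho. The file names vectors.txt and wordsim353.tsv are placeholders for illustration.

```python
# Minimal sketch of word-similarity evaluation: cosine similarity vs. human scores.
import numpy as np
from scipy.stats import spearmanr

def load_vectors(path):
    """Load embeddings in word2vec text format: 'word v1 v2 ... vd' per line."""
    vecs = {}
    with open(path, encoding="utf-8") as f:
        next(f)  # skip the header line (vocab size, dimension)
        for line in f:
            parts = line.rstrip().split(" ")
            vecs[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vecs

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def evaluate(vecs, pairs_path):
    """pairs_path: tab-separated lines of 'word1<TAB>word2<TAB>human_score'."""
    model_scores, human_scores = [], []
    with open(pairs_path, encoding="utf-8") as f:
        for line in f:
            w1, w2, score = line.rstrip().split("\t")
            if w1 in vecs and w2 in vecs:  # skip out-of-vocabulary pairs
                model_scores.append(cosine(vecs[w1], vecs[w2]))
                human_scores.append(float(score))
    return spearmanr(model_scores, human_scores).correlation

vecs = load_vectors("vectors.txt")
print("Spearman rho:", evaluate(vecs, "wordsim353.tsv"))
```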
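On the NMT perplexity evaluation: OpenNMT-py reports perplexity during training and validation, which is simply the exponential of the average per-token negative log-likelihood. A toy sketch of the computation (the loss values are made up for illustration):

```python
# Perplexity = exp(mean negative log-likelihood per target token).
import math

def perplexity(token_nlls):
    """token_nlls: per-token negative log-likelihoods in nats."""
    return math.exp(sum(token_nlls) / len(token_nlls))

token_nlls = [2.1, 3.4, 1.7, 2.9]   # hypothetical values
print(perplexity(token_nlls))        # exp(2.525) ~= 12.5
```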