
PyTorch implementation of the paper - Dynamic Meta-Embeddings for Improved Sentence Representations, EMNLP 2018

This repository contains my PyTorch implementation of the paper:

Dynamic Meta-Embeddings for Improved Sentence Representations
Douwe Kiela, Changhan Wang and Kyunghyun Cho
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

While one of the first steps in many NLP systems is selecting what pre-trained word embeddings to use, we argue that such a step is better left for neural networks to figure out by themselves. To that end, we introduce dynamic meta-embeddings, a simple yet effective method for the supervised learning of embedding ensembles, which leads to state-of-the-art performance within the same model class on a variety of tasks. We subsequently show how the technique can be used to shed new light on the usage of word embeddings in NLP systems.
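
As a rough illustration of the idea (this is not the code in this repository, which uses PyTorch), the non-contextual `dme` variant can be sketched in plain Python. Each embedding set is projected to a shared dimension, a learned scalar scorer assigns each projection a softmax weight, and the meta-embedding is the weighted sum. The function names `linear`, `softmax`, `dme` and `random_matrix`, the toy dimensions and the random parameters are all assumptions made for this sketch.

```python
import math
import random


def linear(vec, weight, bias):
    """Compute W x + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weight, bias)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dme(word_vectors, projections, attn_weight, attn_bias):
    """Combine one word's vectors from several embedding sets.

    word_vectors: one vector per embedding set (dimensions may differ)
    projections:  one (weight, bias) pair per set, mapping it to proj_dim
    attn_weight, attn_bias: parameters of the scalar scorer a . w' + b
    Returns the meta-embedding and the attention weights.
    """
    # Project every embedding set into the same proj_dim-dimensional space.
    projected = [linear(v, w, b) for v, (w, b) in zip(word_vectors, projections)]
    # Score each projected vector, then normalise across embedding sets.
    scores = [sum(a * x for a, x in zip(attn_weight, p)) + attn_bias
              for p in projected]
    alphas = softmax(scores)
    # The meta-embedding is the attention-weighted sum of the projections.
    meta = [sum(alpha * p[k] for alpha, p in zip(alphas, projected))
            for k in range(len(attn_weight))]
    return meta, alphas


# Toy example: two embedding sets (3-d and 5-d) projected to 4 dimensions.
rng = random.Random(0)


def random_matrix(rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


proj_dim = 4
word_vectors = [[rng.random() for _ in range(3)],
                [rng.random() for _ in range(5)]]
projections = [(random_matrix(proj_dim, 3), [0.0] * proj_dim),
               (random_matrix(proj_dim, 5), [0.0] * proj_dim)]
attn_weight = [rng.uniform(-0.1, 0.1) for _ in range(proj_dim)]

meta, alphas = dme(word_vectors, projections, attn_weight, 0.0)
print(len(meta), alphas)
```

In the paper, the contextualized variant (`cdme`) differs in that the attention scores are computed from the hidden states of a BiLSTM run over the projected embeddings rather than from the projected embeddings themselves.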


  • Clone this repository and install the requirements:

    git clone https://github.com/kushalchauhan98/dynamic-meta-embeddings.git
    cd dynamic-meta-embeddings
    pip install -r requirements.txt
  • Train the model. The training script downloads the datasets and pre-trained word embeddings automatically:

    python main.py [arguments...]

    The available arguments are:

      -h, --help            show this help message and exit
      --task {snli}         Name of task (default: snli)
      --embedder {single,concat,dme,cdme}
                            Type of embedder to use (default: cdme)
      --proj_dim PROJ_DIM   Dimension to which the embeddings should be projected (default: 256)
      --emb_dropout EMB_DROPOUT
                            Dropout probability for the Embedding layer (default: 0.2)
      --vectors {charngram.100d,fasttext.en.300d,fasttext.simple.300d,glove.42B.300d,glove.840B.300d,
                            Pretrained word embeddings to use (default:
                            ['glove.840B.300d', 'crawl-300d-2M'])
      --rnn_dim RNN_DIM     No. of hidden units in the sentence encoder LSTM (default: 512)
      --fc_dim FC_DIM       No. of hidden units in the Classifier (default: 1024)
      --clf_dropout CLF_DROPOUT
                            Dropout probability for the Classifier (default: 0.2)
      --n_classes N_CLASSES
                            No. of classes in dataset (default: 3)
      --bs BS               Batch size (default: 64)
      --lr LR               Learning Rate (default: 0.0004)
      --epochs EPOCHS       No. of epochs (default: 50)
      --device {cuda,cpu}   Device to use (default: cuda)

    For example:

    python main.py --task snli \
    	--embedder dme \
    	--vectors glove.840B.300d crawl-300d-2M \
    	--emb_dropout 0 \
    	--clf_dropout 0 \
    	--lr 0.000003 \
    	--epochs 5

    With no arguments, the script trains on the SNLI dataset using Contextual Dynamic Meta-Embeddings (`--embedder cdme`), with the network architecture and hyperparameters described in the paper.
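
For readers who want to reuse the same command-line interface, the options above could be declared with `argparse` roughly as follows. This is a reconstruction from the help text, not the actual contents of `main.py`: the `build_parser` name and the description string are assumptions, and `--vectors` is left without a `choices` list because the list is truncated in the help output above.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(
        description="Train a sentence encoder with dynamic meta-embeddings",
        # Appends "(default: ...)" to every help string.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("--task", choices=["snli"], default="snli",
                        help="Name of task")
    parser.add_argument("--embedder", choices=["single", "concat", "dme", "cdme"],
                        default="cdme", help="Type of embedder to use")
    parser.add_argument("--proj_dim", type=int, default=256,
                        help="Dimension to which the embeddings are projected")
    parser.add_argument("--emb_dropout", type=float, default=0.2,
                        help="Dropout probability for the Embedding layer")
    # One or more embedding names; choices omitted (truncated in the README).
    parser.add_argument("--vectors", nargs="+",
                        default=["glove.840B.300d", "crawl-300d-2M"],
                        help="Pretrained word embeddings to use")
    parser.add_argument("--rnn_dim", type=int, default=512,
                        help="No. of hidden units in the sentence encoder LSTM")
    parser.add_argument("--fc_dim", type=int, default=1024,
                        help="No. of hidden units in the Classifier")
    parser.add_argument("--clf_dropout", type=float, default=0.2,
                        help="Dropout probability for the Classifier")
    parser.add_argument("--n_classes", type=int, default=3,
                        help="No. of classes in dataset")
    parser.add_argument("--bs", type=int, default=64, help="Batch size")
    parser.add_argument("--lr", type=float, default=0.0004, help="Learning Rate")
    parser.add_argument("--epochs", type=int, default=50, help="No. of epochs")
    parser.add_argument("--device", choices=["cuda", "cpu"], default="cuda",
                        help="Device to use")
    return parser


# Parse the example command shown above instead of reading sys.argv.
args = build_parser().parse_args([
    "--task", "snli",
    "--embedder", "dme",
    "--vectors", "glove.840B.300d", "crawl-300d-2M",
    "--emb_dropout", "0",
    "--clf_dropout", "0",
    "--lr", "0.000003",
    "--epochs", "5",
])
print(args.embedder, args.vectors, args.lr, args.epochs)
```

Passing an explicit list to `parse_args` keeps the sketch runnable in a notebook or test; a real script would call `parse_args()` with no arguments so that options are read from the command line.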