galsang/BIMPM-pytorch

About the vocabulary constructed on SNLI

Howardqlz opened this issue · 2 comments

I see this code:
self.TEXT.build_vocab(self.train, self.dev, self.test, vectors=GloVe(...))
As far as I know, we should construct the vocabulary only on the training set, shouldn't we?

That line builds an embedding matrix that maps every word in the datasets (dev and test as well as train) to a word representation initialized with its pre-trained GloVe vector.
This lets us use the pre-trained vector for a word that appears in the test set but not in the training set, even though that vector is never fine-tuned, because the word never occurs in a training batch.
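To make the difference concrete, here is a minimal sketch in plain Python rather than torchtext. The toy splits, the 2-dimensional `pretrained` table, and the helpers `build_vocab`, `embedding_matrix`, and `lookup` are all hypothetical stand-ins for illustration; they are not the repository's code or the torchtext API. It only shows what happens to a test-only word ("cat") under the two ways of building the vocabulary.

```python
from collections import Counter

# Hypothetical toy data standing in for the SNLI splits.
train = [["a", "man", "sleeps"], ["a", "dog", "runs"]]
dev = [["a", "man", "runs"]]
test = [["a", "cat", "sleeps"]]  # "cat" never appears in train

# Hypothetical 2-d vectors standing in for pre-trained GloVe.
pretrained = {
    "a": [0.1, 0.2], "man": [0.3, 0.4], "dog": [0.5, 0.6],
    "cat": [0.7, 0.8], "sleeps": [0.9, 1.0], "runs": [1.1, 1.2],
}

UNK = "<unk>"


def build_vocab(*splits):
    """Map every token found in the given splits to an index; index 0 is <unk>."""
    counts = Counter(tok for split in splits for sent in split for tok in sent)
    itos = [UNK] + sorted(counts)
    return {tok: i for i, tok in enumerate(itos)}


def embedding_matrix(stoi, dim=2):
    """Row i holds the pre-trained vector of token i, or zeros if none exists."""
    matrix = [[0.0] * dim for _ in stoi]
    for tok, i in stoi.items():
        if tok in pretrained:
            matrix[i] = list(pretrained[tok])
    return matrix


def lookup(tok, stoi, matrix):
    """Return the embedding row for tok, falling back to the <unk> row."""
    return matrix[stoi.get(tok, stoi[UNK])]


train_only = build_vocab(train)
all_splits = build_vocab(train, dev, test)

vec_train_only = lookup("cat", train_only, embedding_matrix(train_only))
vec_all_splits = lookup("cat", all_splits, embedding_matrix(all_splits))

print(vec_train_only)  # [0.0, 0.0]: "cat" collapses to <unk>
print(vec_all_splits)  # [0.7, 0.8]: "cat" keeps its pre-trained vector
```

In this sketch only the tokens of dev and test enter the vocabulary, not their labels. The row for "cat" is never updated by training in either case; the only difference is whether the model sees the pre-trained vector or the `<unk>` row at test time.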

Hello, why does the code stop after running one epoch?