error when encountering unknown NER label
A toy model crashes when it encounters an unknown NER label.
To reproduce, run
python3 -u train.py jsonnets/toyAMRAutomata.jsonnet -s example/toyAMRAutomataOutput/ -f --file-friendly-logging
on commit 1282115 on the unsupervised2020 branch.
According to allenai/allennlp#2147, crashing on an unseen label is the intended behaviour as long as no OOV token (i.e. a token that says "I'm the OOV token") is in the vocabulary. My guess is that such an OOV token usually gets added automatically, but not in this toy example.
Whether an OOV token is added is controlled by the Vocabulary class: https://docs.allennlp.org/v0.9.0/api/allennlp.data.vocabulary.html#allennlp.data.vocabulary.Vocabulary. You can adjust this in the config file; there is already a "vocabulary" entry in jsonnets/emnlp20/glove/AMR-2015.jsonnet, for example. Of course, the OOV token embedding will be untrained.
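
To make the mechanism concrete, here is a minimal sketch in plain Python. It is not AllenNLP's implementation: the class ToyVocabulary and its add_oov parameter are hypothetical names made up for illustration, and the indexing is simplified. It only shows why a label vocabulary without an OOV entry has to fail on an unseen label, while one with an OOV entry maps the unseen label to a shared fallback index.

```python
class ToyVocabulary:
    """Hypothetical stand-in for a label vocabulary (not the AllenNLP class)."""

    # AllenNLP's default OOV token string; reused here only as a familiar name.
    OOV_TOKEN = "@@UNKNOWN@@"

    def __init__(self, labels, add_oov):
        self._index = {}
        if add_oov:
            # Reserve one index that all unseen labels will share.
            self._index[self.OOV_TOKEN] = 0
        for label in labels:
            # Assign the next free index to each label not seen before.
            self._index.setdefault(label, len(self._index))

    def get_index(self, label):
        if label in self._index:
            return self._index[label]
        if self.OOV_TOKEN in self._index:
            # Fall back to the OOV index; its embedding is never trained
            # unless unseen labels also occur during training.
            return self._index[self.OOV_TOKEN]
        # No fallback available, so an unseen label is an error.
        raise KeyError(f"unknown label: {label!r}")


train_labels = ["PERSON", "ORG"]
strict = ToyVocabulary(train_labels, add_oov=False)   # crashes on unseen labels
lenient = ToyVocabulary(train_labels, add_oov=True)   # maps them to the OOV index
```

With these objects, lenient.get_index("LOCATION") returns the OOV index, whereas strict.get_index("LOCATION") raises a KeyError, which mirrors the crash reported here. In AllenNLP itself, as far as I can tell from the Vocabulary documentation linked above, the corresponding switch is the non_padded_namespaces argument, which lists the namespaces that get no padding and OOV tokens; please check the docs for the exact default before relying on it.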