Is there a way to turn off CRF and use a dense layer as output decoder?
mxc19912008 opened this issue · 5 comments
Hi,
Nice project!
I am trying to run an experiment that compares performance when we turn off the CRF layer and use a dense layer as the output decoder instead.
```
# input size: size of LSTM hidden states, output size: label size
linear = nn.Linear(lstm_scores.size()[2], label_size)
out = linear(lstm_scores)
out = F.log_softmax(out, dim=1)
# pick the index of the largest score for each word
decodeIdx = torch.argmax(inputs, dim=2)
return decodeIdx
```
But the precision, recall and F1 score are all really low, around 1.5.
Any suggestion about how to implement this?
Thanks in advance!
By the way, it seems that the argmax picks a random index each time. For example, the "-o" label corresponds to a different index on each run.
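For comparison, here is a minimal sketch of a dense (per-token softmax) decoder. It is not the repository's code: the class name `DenseDecoder`, its arguments, and the tensor sizes are hypothetical, and the input is assumed to have shape `(batch, seq_len, hidden_size)`. It differs from the snippet above in three ways: the linear layer is created once in `__init__` rather than on every call, the softmax runs over the label dimension, and the argmax is taken over the layer's own output. Creating `nn.Linear` inside the forward pass gives fresh random weights each time, which is one plausible reason the predicted indices look random.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DenseDecoder(nn.Module):
    """Hypothetical per-token classifier used in place of a CRF layer."""

    def __init__(self, hidden_size: int, label_size: int):
        super().__init__()
        # Create the layer once so its weights are trained and reused,
        # not re-initialized randomly on every forward pass.
        self.linear = nn.Linear(hidden_size, label_size)

    def forward(self, lstm_out: torch.Tensor) -> torch.Tensor:
        # lstm_out: (batch, seq_len, hidden_size)
        logits = self.linear(lstm_out)
        # Normalize over the label dimension (the last one).
        return F.log_softmax(logits, dim=-1)

    def decode(self, lstm_out: torch.Tensor) -> torch.Tensor:
        # Pick the highest-scoring label index for each token.
        return torch.argmax(self.forward(lstm_out), dim=-1)


# Illustrative sizes only: batch of 2 sentences, 4 tokens each, 8 hidden units, 5 labels.
torch.manual_seed(0)
decoder = DenseDecoder(hidden_size=8, label_size=5)
lstm_out = torch.randn(2, 4, 8)
log_probs = decoder(lstm_out)
pred = decoder.decode(lstm_out)
```

Because the layer lives on the module, its parameters are registered with the optimizer and the same input always decodes to the same indices.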
I will look into it and get back to you soon.
I have committed new changes to the files.
You can check out what's changed in this commit: f7ef24a
Now you can turn off the CRF layer by passing this option on the command line:
`python3 trainer.py --use_crf_layer 0`
The default is 1.
Feel free to reopen the issue if you still run into any errors.
--------- You don't need to read what follows; it just describes how I modified the code.
Basically, you may have missed the following details:
- When you calculate the loss, you need to mask it; you can see the difference here (f7ef24a#diff-64d896d851ee690a494e0a27969af1ac). I'm not sure if this step is compulsory, but it's definitely correct to do so.
- During evaluation, we flip the prediction in `eval.py` because of the Viterbi decoding. Now, if we do not use Viterbi, we don't need to flip the prediction in `eval.py`.
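The loss masking mentioned in the first point can be sketched roughly as follows. This is an illustration, not the code from the commit: the function name `masked_nll_loss`, the `lengths` argument, and the assumption that "masking" means ignoring padded positions past each sentence's true length are all mine.

```python
import torch
import torch.nn.functional as F


def masked_nll_loss(log_probs, labels, lengths):
    """Average negative log-likelihood over real (non-padded) tokens only.

    log_probs: (batch, seq_len, num_labels) log-softmax scores
    labels:    (batch, seq_len) gold label indices (any value at padded positions)
    lengths:   (batch,) true length of each sentence
    """
    batch, seq_len, num_labels = log_probs.shape
    # mask[b, t] is True for real tokens and False for padding.
    positions = torch.arange(seq_len, device=lengths.device).unsqueeze(0)
    mask = positions < lengths.unsqueeze(1)
    # Per-token loss, kept unreduced so padded positions can be zeroed out.
    per_token = F.nll_loss(
        log_probs.reshape(-1, num_labels), labels.reshape(-1), reduction="none"
    ).reshape(batch, seq_len)
    return (per_token * mask).sum() / mask.sum()


# Illustrative batch: 2 sentences padded to 3 tokens, 4 labels.
torch.manual_seed(0)
log_probs = F.log_softmax(torch.randn(2, 3, 4), dim=-1)
labels = torch.tensor([[1, 2, 0], [3, 0, 0]])
lengths = torch.tensor([3, 1])
loss = masked_nll_loss(log_probs, labels, lengths)
```

With the mask applied, whatever label values sit in the padded positions have no effect on the loss or its gradients.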
BTW, I can see the results go up to 90.5 on the CoNLL-2003 dataset with GloVe embeddings. It is still running, though.
Thank you so much!