
Train/Decode speed comparison with CRF

Hello, thank you for the great work. I have a question about the speed comparison.
In the published code, when the CRF is enabled, the CRF layer in the file is only used to compute the loss in the neg_log_liklihood_loss function; the forward function does not use the CRF layer at all. Did I miss something? I look forward to your reply. Thank you very much.

Hi, thanks for the kind words about the paper.
The current version does not support the CRF baseline, because I forgot to add the baseline code before uploading. I will update it as soon as possible.
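
For readers following this thread, here is a minimal sketch of how a linear-chain CRF usually takes part in both steps under discussion: the negative log-likelihood at training time and Viterbi decoding at prediction time. It is not the repository's code. Every name in it (sequence_score, log_partition, neg_log_likelihood, viterbi_decode) is hypothetical, and it assumes plain Python lists for the emission and transition scores and no special start or stop tags.

```python
import math


def log_sum_exp(values):
    # Numerically stable log(sum(exp(v))) over a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sequence_score(emissions, transitions, tags):
    # Unnormalized score of one tag sequence.
    # emissions[t][j]: score of tag j at position t (assumed layout).
    # transitions[i][j]: score of moving from tag i to tag j (assumed layout).
    score = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        score += transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]
    return score


def log_partition(emissions, transitions):
    # Forward algorithm: log of the summed exp-scores of all tag sequences.
    num_tags = len(transitions)
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            log_sum_exp([alpha[i] + transitions[i][j] for i in range(num_tags)])
            + emissions[t][j]
            for j in range(num_tags)
        ]
    return log_sum_exp(alpha)


def neg_log_likelihood(emissions, transitions, gold_tags):
    # Training-time use of the CRF: the loss for one sentence.
    gold = sequence_score(emissions, transitions, gold_tags)
    return log_partition(emissions, transitions) - gold


def viterbi_decode(emissions, transitions):
    # Decode-time use of the CRF: the highest-scoring tag sequence.
    num_tags = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_tags):
            best_i = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Follow the backpointers from the best final tag to recover the path.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

In a setup like this, a fair train/decode speed comparison would time neg_log_likelihood for training and viterbi_decode for prediction, since a forward pass that skips the CRF only measures the per-token scoring.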