Implement F1 score + accuracy
tobiasvanderwerff commented
Quote from the paper:
We evaluate the system performance on slot filling using F1 score, and the performance on intent detection using classification error rate.
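To make a fair comparison between our results and those from the paper, we should implement the F1 score for evaluating slot filling performance, and the accuracy (correct predictions / all predictions) for intent classification. The paper reports intent detection as a classification error rate, which is simply 1 - accuracy, so either number can be converted to the other.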
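A rough sketch of what the two metrics could look like is below. It assumes slot labels use BIO tags (`B-x`, `I-x`, `O`) and that F1 is computed over whole slot spans, micro-averaged across sentences. The quote does not say whether the paper scores spans or tokens, so that choice is an assumption, and the function names `extract_spans`, `slot_f1`, and `intent_accuracy` are hypothetical rather than anything already in the repo.

```python
def extract_spans(tags):
    """Return a set of (slot_type, start, end) spans from a BIO tag sequence."""
    spans = set()
    start, slot = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel "O" flushes the last span
        # The open span continues only on an "I-" tag of the same slot type.
        continues = tag.startswith("I-") and tag[2:] == slot
        if slot is not None and not continues:
            spans.add((slot, start, i))
            start, slot = None, None
        # Assumption: an "I-" tag with no open span starts a new span.
        if tag.startswith("B-") or (tag.startswith("I-") and slot is None):
            start, slot = i, tag[2:]
    return spans


def slot_f1(gold_seqs, pred_seqs):
    """Micro-averaged span-level F1 over a list of tag sequences."""
    n_correct = n_gold = n_pred = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
        n_correct += len(gold_spans & pred_spans)  # exact type + boundary match
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    precision = n_correct / n_pred if n_pred else 0.0
    recall = n_correct / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def intent_accuracy(gold_intents, pred_intents):
    """Correct predictions / all predictions; error rate is 1 - accuracy."""
    if not gold_intents:
        return 0.0
    correct = sum(g == p for g, p in zip(gold_intents, pred_intents))
    return correct / len(gold_intents)
```

If the paper turns out to score slots per token instead of per span, only `extract_spans` would need to change; the precision/recall bookkeeping in `slot_f1` stays the same.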