prdwb/bert_hae

What does the n_best_size parameter mean?

Opened this issue · 3 comments

Thank you!

prdwb commented

The model predicts, for each token in the passage, the probability that it is the begin token or the end token of the answer span. Combining a begin token with an end token gives a candidate answer span. n_best_size is the number of top-scoring begin/end combinations we consider for the final output.
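
To make this concrete, here is a minimal sketch of how n_best_size could be applied. It assumes the usual BERT SQuAD-style post-processing (take the top begin and end indices, pair them, drop invalid pairs, rank by summed score); the function name `n_best_spans` and the `max_answer_length` parameter are illustrative and not taken from this repository.

```python
def n_best_spans(start_scores, end_scores, n_best_size, max_answer_length):
    """Return up to n_best_size (start, end, score) spans, best first.

    Hypothetical helper: scores are per-token begin/end scores (e.g. logits).
    """
    def top_indices(scores):
        # Indices of the n_best_size highest-scoring tokens.
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return order[:n_best_size]

    candidates = []
    for s in top_indices(start_scores):
        for e in top_indices(end_scores):
            if e < s:
                continue  # the end token cannot precede the begin token
            if e - s + 1 > max_answer_length:
                continue  # span is longer than allowed (assumed limit)
            candidates.append((start_scores[s] + end_scores[e], s, e))

    # Rank the combinations by their summed score and keep the best ones.
    candidates.sort(reverse=True)
    return [(s, e, score) for score, s, e in candidates[:n_best_size]]


spans = n_best_spans(
    start_scores=[0.1, 2.0, 0.3, 1.5],
    end_scores=[0.2, 0.1, 3.0, 1.0],
    n_best_size=2,
    max_answer_length=3,
)
print(spans)
```

A larger n_best_size considers more candidate spans at the cost of more pairs to score; the first entry of the returned list would be the predicted answer.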

Thanks a lot.
If the text of an example is longer than the maximum length, a sliding window is used. How do you handle the case where the answer is not inside a given window?

prdwb commented

For training, if a document chunk does not contain an annotation, we throw it out, since there is nothing to predict.
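
A rough sketch of that filtering step, under some assumptions: the passage is already tokenized, the answer is given as inclusive token indices, and a chunk is kept only if it contains the whole answer span. The names `make_training_chunks`, `max_len`, and `stride` are hypothetical and not taken from this repository.

```python
def make_training_chunks(tokens, answer_start, answer_end, max_len, stride):
    """Slide a window over tokens and keep only chunks containing the answer.

    Returns (chunk_offset, start_in_chunk, end_in_chunk) tuples.
    Assumes stride > 0 and inclusive answer indices.
    """
    chunks = []
    offset = 0
    while True:
        window = tokens[offset:offset + max_len]
        window_end = offset + len(window) - 1
        # Keep the chunk only if the full answer span lies inside it;
        # otherwise it is thrown out, since there is nothing to predict.
        if answer_start >= offset and answer_end <= window_end:
            chunks.append((offset, answer_start - offset, answer_end - offset))
        if offset + max_len >= len(tokens):
            break  # this window already reached the end of the passage
        offset += stride
    return chunks


tokens = ["t%d" % i for i in range(10)]
# The answer covers tokens 6..7; windows of 4 tokens move 2 tokens at a time.
print(make_training_chunks(tokens, answer_start=6, answer_end=7, max_len=4, stride=2))
```

With overlapping windows, the same answer can appear in more than one kept chunk, each with positions re-based to the start of that chunk.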