Held-out perplexity question
broalantaps opened this issue · 3 comments
broalantaps commented
Hey, this is exciting work! Thanks for your contribution. I have one point of confusion:
The paper mentions that perplexity is calculated on the held-out last 2048 tokens, but it looks like you compute the NLL over the entire sequence during the evaluation phase instead of restricting it to the last 2048 tokens (a rough sketch of what I expected is below). I'd really appreciate a reply, thanks!
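A minimal sketch of what I had in mind, assuming an HF-style causal-LM loss where label positions set to -100 are ignored; the tensors here are random stand-ins, not from this repo:

```python
import torch
from torch.nn import functional as F

# Stand-in shapes for illustration only.
vocab, seq_len, held_out = 100, 4096, 2048
input_ids = torch.randint(vocab, (1, seq_len))
logits = torch.randn(1, seq_len, vocab)

# Score only the held-out last 2048 tokens by masking earlier
# positions out of the loss with -100.
labels = input_ids.clone()
labels[:, :-held_out] = -100

nll = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab),  # position t predicts token t+1
    labels[:, 1:].reshape(-1),
    ignore_index=-100,
)
ppl = nll.exp()  # perplexity = exp(mean per-token NLL)
```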
AlexChvl commented
Thanks for your question! During evaluation, all metrics are computed with the compute_loss() function in subset_trainer.py. The loss on each segment is logged individually under the substep_{substep}-seg{i}-nll metric. When evaluating with different segment configurations, you should compare the losses on the last segment for each configuration.
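For concreteness, here is a minimal sketch of reading off the final-segment perplexity from those logs; the metrics dict and its values are made up for illustration, and only the metric naming follows compute_loss() above:

```python
import math

# Hypothetical logged eval metrics, one NLL entry per segment.
metrics = {
    "substep_0-seg0-nll": 2.91,
    "substep_0-seg1-nll": 2.74,
    "substep_0-seg2-nll": 2.58,  # last (held-out) segment
}

# Pick the entry with the highest segment index, i.e. the last segment.
last_seg_key = max(
    (k for k in metrics if k.endswith("-nll")),
    key=lambda k: int(k.split("-seg")[1].split("-")[0]),
)

# Perplexity is the exponential of the mean per-token NLL.
ppl = math.exp(metrics[last_seg_key])
print(f"{last_seg_key}: nll={metrics[last_seg_key]:.3f}, ppl={ppl:.3f}")
```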
Does this answer your question? Otherwise let me know more details of the issue you're facing.
broalantaps commented
Thanks for your answer! So you took the last substep_{substep}-seg{i}-nll and reported its perplexity, right?
AlexChvl commented
Yes, that's right.