princeton-nlp/AutoCompressors

Held-out perplexity question

broalantaps opened this issue · 3 comments

Hi, this is exciting work! Thanks for your contribution. I have a point of confusion:

The paper mentions that perplexity is calculated on the held-out last 2048 tokens. But it seems that you calculate the NLL over the entire sequence during the evaluation phase instead of fixing it to the last 2048 tokens. I would really appreciate a reply, thanks!

Thanks for your question! During evaluation, all metrics are computed with the compute_loss() function in subset_trainer.py. The loss on each segment is logged individually under the substep_{substep}-seg{i}-nll metric. When evaluating with different segments, you should compare the losses on the last segment for each configuration.
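As a rough illustration (the metric dict below is hypothetical; only the substep_{substep}-seg{i}-nll naming pattern comes from compute_loss() in subset_trainer.py), picking the last segment's NLL and exponentiating it gives the held-out perplexity:

```python
import math

# Hypothetical logged metrics following the substep_{substep}-seg{i}-nll
# pattern; the actual values here are illustrative only.
eval_metrics = {
    "substep_0-seg0-nll": 2.91,
    "substep_0-seg1-nll": 2.74,
    "substep_0-seg2-nll": 2.63,
}

# Find the highest segment index (the held-out segment) and report
# perplexity as exp(nll) for that segment.
last_seg = max(int(k.split("seg")[1].split("-")[0]) for k in eval_metrics)
held_out_nll = eval_metrics[f"substep_0-seg{last_seg}-nll"]
print(f"held-out perplexity: {math.exp(held_out_nll):.2f}")
```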

Does this answer your question? Otherwise let me know more details of the issue you're facing.

Thanks for your answer! So you took the last substep_{substep}-seg{i}-nll and reported the perplexity from that, right?

Yes, that's right.