Data used for computing perplexity

Question

liuyq123 opened this issue 2 years ago · 4 comments

Hi Gonzalo,

I'd like to compute perplexity to see if my retrained model is of the same quality as your model. But I don't have the data/mlm/windows/five_prime_UTR.test/512/128/seqs.txt file you used.
Can you tell me how to download it? Thank you!

Answer 1 · 2023-01-16T20:33:05.000Z

Hello, we didn't end up using that file. We computed perplexity on the validation_file argument in the training script (data/mlm/dataset/test/Arabidopsis_thaliana.test.512.256.parquet).

Answer 2 · 2023-01-16T20:58:12.000Z

Thank you!

Is the perplexity just the eval loss in Weights & Biases? But my eval loss started at 1.15, and I expected it to be bigger than the 3.01 reported in your paper.

Answer 3 · 2023-01-16T21:03:35.000Z

Perplexity is just $e^{\text{eval loss}}$. For example, $e^{1.15} \approx 3.16$. Have you tried running some final steps with a lower learning rate?
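
For anyone who wants to check the conversion, here is a minimal sketch. It assumes the logged eval loss is the mean cross-entropy in nats (natural log), which is the usual convention but is not confirmed in this thread; the function name `perplexity_from_loss` is made up for illustration.

```python
import math


def perplexity_from_loss(eval_loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(eval_loss)


# Eval loss of 1.15 quoted earlier in the thread.
print(round(perplexity_from_loss(1.15), 2))  # 3.16
```

If the loss were logged in bits instead of nats, the base would be 2 rather than e.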

Answer 4 · 2023-01-16T21:16:09.000Z

I see, thank you!

I haven't yet. The parameters I'm using now are the same as the ones in train_512_convnet_only_athaliana.sh. I've finished 700,000 steps, and the loss is 1.12. I will try a lower learning rate later. Thank you!
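
As a rough check of how far this is from the paper's number, the same relationship can be inverted. This is only a sketch: it assumes the 1.12 above is the eval loss (not the training loss) and that it is measured in nats.

```python
import math

reported_perplexity = 3.01  # value from the paper, as quoted in this thread
current_loss = 1.12         # loss after 700,000 steps (assumed to be eval loss)

# Eval loss that would correspond to the reported perplexity.
target_loss = math.log(reported_perplexity)
# Perplexity implied by the current loss.
current_perplexity = math.exp(current_loss)

print(f"target eval loss:   {target_loss:.3f}")         # 1.102
print(f"current perplexity: {current_perplexity:.2f}")  # 3.06
```

Under those assumptions, a loss of 1.12 gives a perplexity of about 3.06, so the remaining gap to 3.01 is about 0.02 in loss.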