VILA-Lab/GBLM-Pruner

Results of LLaMA-2 are different from Wanda


pprp commented

For LLaMA-2, Wanda and GBLM-Pruner report different perplexity (ppl). Any thoughts?

Hi, thanks for the question. The numbers differ because the Wanda paper computes LLaMA-2 perplexity on WikiText with a sequence length of 4096, whereas GBLM-Pruner evaluates with a sequence length of 2048.
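For reference, here is a minimal sketch of how the evaluation sequence length enters a standard WikiText perplexity loop (in the Wanda/SparseGPT style). The dataset config, model name, and the two `seqlen` values are assumptions for illustration, not necessarily what either repo uses verbatim:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer


@torch.no_grad()
def wikitext_ppl(model, tokenizer, seqlen=2048, device="cuda"):
    # Concatenate the WikiText-2 test split and tokenize it as one long stream.
    test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
    enc = tokenizer("\n\n".join(test["text"]), return_tensors="pt")
    input_ids = enc.input_ids.to(device)

    nsamples = input_ids.numel() // seqlen
    nlls = []
    for i in range(nsamples):
        # Each evaluation window is `seqlen` tokens long. Changing seqlen
        # (e.g. 2048 vs 4096) changes how much context each token sees,
        # which shifts the reported perplexity even for the same checkpoint.
        batch = input_ids[:, i * seqlen:(i + 1) * seqlen]
        loss = model(batch, labels=batch).loss
        nlls.append(loss.float() * seqlen)

    # Perplexity = exp(mean negative log-likelihood per token).
    return torch.exp(torch.stack(nlls).sum() / (nsamples * seqlen))


# Hypothetical usage: compare the two evaluation settings on the same model.
# model = AutoModelForCausalLM.from_pretrained(
#     "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16).cuda()
# tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
# print(wikitext_ppl(model, tokenizer, seqlen=2048))  # GBLM-Pruner setting
# print(wikitext_ppl(model, tokenizer, seqlen=4096))  # Wanda paper setting
```

So the two repos are not disagreeing on the pruned model itself; the reported numbers are simply computed under different evaluation windows.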

Thank you for your answers. 😄