spcl/QuaRot

Reproducing paper Table 8

Closed · 1 comment

Thanks for sharing the code :)
I'm trying to reproduce the results from the original paper, but I'm having trouble choosing the right arguments.
In Table 8 (WikiText-2 perplexity), the Llama-7b model with INT4 has a perplexity of 6.10.

However, when I run with the arguments below, I only get 6.313 and 6.348.
Is there an example set of arguments that reproduces Table 8?

python main.py --model meta-llama/Llama-2-7b-hf --tasks wikitext2 --rotate --a_bits 4 --v_bits 4 --k_bits 4 --w_bits 4 --w_clip --k_asym --v_asym --bsz 4

python main.py --model meta-llama/Llama-2-7b-hf --tasks wikitext2 --rotate --a_bits 4 --v_bits 4 --k_bits 4 --w_bits 4 --w_clip --bsz 4

Hi @mjyun01

I think you need to add the other clipping arguments (like --v_clip_ratio) and use the values given in the paper. They improve the PPL a bit further, so you should be able to reproduce the paper's results. See here
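
For anyone wondering why a clipping ratio can change perplexity at all, here is a minimal, generic sketch of symmetric fake-quantization with a clip ratio. It is not the QuaRot implementation. The names `quantize_symmetric`, `mean_squared_error`, and the toy data are hypothetical and only illustrate the idea: a ratio below 1.0 shrinks the quantization range, which clips outliers but gives finer resolution to the remaining values.

```python
def quantize_symmetric(values, bits=4, clip_ratio=1.0):
    """Fake-quantize a list of floats with one symmetric scale.

    Illustrative only (not QuaRot's code). clip_ratio < 1.0 shrinks the
    range below max(|x|), so large outliers are clipped while smaller
    values are represented with a finer step size.
    """
    qmax = 2 ** (bits - 1) - 1                      # 7 for INT4
    max_abs = max(abs(v) for v in values) * clip_ratio
    scale = max_abs / qmax if max_abs > 0 else 1.0  # avoid division by zero
    out = []
    for v in values:
        q = round(v / scale)                        # nearest integer level
        q = max(-qmax - 1, min(qmax, q))            # clamp to [-8, 7] for INT4
        out.append(q * scale)                       # map back to float
    return out


def mean_squared_error(a, b):
    """Average squared difference between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Toy data (an assumption for illustration): small values plus one outlier.
data = [0.1 * i for i in range(-10, 11)] + [4.0]
for ratio in (1.0, 0.9, 0.5):
    approx = quantize_symmetric(data, bits=4, clip_ratio=ratio)
    print(f"clip_ratio={ratio}: mse={mean_squared_error(data, approx):.5f}")
```

Which ratio gives the lowest error depends on the distribution of the tensor being quantized, which is presumably why the paper reports specific values for each clipping argument.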