Cornell-RelaxML/QuIP
Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"
Python
Issues
- will it support group quant (#14, opened by oreo0906, 1 comment)
- gptq reorder canceled in quip? (#13, opened by chenzx921020, 1 comment)
- Why only class Quant3Linear(nn.Module)? Does it work for 2-bit, 3-bit, and 4-bit, or only 3-bit? (#11, opened by yanni-code, 1 comment)
- Thank you (#9, opened by yanni-code, 1 comment)
- support for llama (#8, opened by Ottovonxu, 1 comment)
- How to use quantized model on inference (#4, opened by yachty66, 2 comments)
- Do we still need to store U, V for each W (#7, opened by xwuShirley, 1 comment)
- group-size for quip (#6, opened by xwuShirley, 4 comments)
- Evaluation on LLama (#2, opened by NicoNico6)