Cornell-RelaxML/QuIP
Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"
Python
Issues
- will it support group quant (#14, opened by oreo0906, 1 comment)
- gptq reorder canceled in quip? (#13, opened by chenzx921020, 1 comment)
- Why only class Quant3Linear(nn.Module)? Does it work for 2-bit, 3-bit, and 4-bit, or only 3-bit? (#11, opened by yanni-code, 1 comment)
- Thank you (#9, opened by yanni-code, 1 comment)
- support for llama (#8, opened by Ottovonxu, 1 comment)
- How to use quantized model on inference (#4, opened by yachty66, 2 comments)
- Do we still need to store U, V for each W (#7, opened by xwuShirley, 1 comment)
- group-size for quip (#6, opened by xwuShirley, 4 comments)
- Evaluation on LLama (#2, opened by NicoNico6)