Issues
When I use SqueezeLLM to quantize the LLaMA2-13B model and test it, the speed is extremely slow.
#71 opened by zhangfzR · 0 comments
Why does LLaMA-2-7B have s0 quantized models, but no s5 and s45 sparsity quantized models?
#68 opened by Evane5cence · 0 comments
Further speeding up the quantization process
#67 opened by SyphonArch · 0 comments
Installation instructions did not lead to the local transformers version being selected, causing errors
#66 opened by RDouglasSharp · 0 comments
Support JAIS models
#65 opened by 7ossam81 · 0 comments
Dense-only quantization bit precision
#63 opened by akarkim · 2 comments
On an A100 card, the speed-up effect does not show up.
#51 opened by leocnj · 0 comments
D+S packing in vLLM seems buggy
#62 opened by MingLin-home · 0 comments
A question about why LLaMA-2-7B and Mistral models only provide Dense-only (0%) quantized models
#56 opened by WeiMa01 · 3 comments
Will it work on a V100 GPU?
#4 opened by Sravanth-k27 · 0 comments
Channel-wise quantization
#52 opened by SoyeonUm · 0 comments
Future plans for this project
#45 opened by tjtanaa · 0 comments
Vicuna-1.5?
#44 opened by mlinmg · 2 comments
Finetune SqueezeLLM
#20 opened by kiucho · 9 comments
Quantisation implementation
#12 opened by huyphan168 · 2 comments
Access to quantisation code
#7 opened by ri938 · 0 comments
Minor bug with --include_sparse
#39 opened by vuiseng9 · 1 comment
Vicuna v1.3
#30 opened by nestordemeure · 1 comment
Add 65B-q3 evaluation
#5 opened by ingenieroariel · 3 comments
Typos in the README.md
#6 opened by matteoguarrera