Issues
auto_gptq-0.7.1.tar.gz has inconsistent version: expected '0.7.1', but metadata has '0.7.1+cu121'
#730 opened by q5sys - 3
Is quantization of InternVL2 supported now?
#732 opened by Jeremy-J-J - 3
[BUG] Language model does not see CUDA even though PyCharm and PyTorch see it; what could be the problem?
#725 opened by GoonerTim - 1
[BUG] Not able to install on Ubuntu 22.04 (subprocess-exited-with-error )
#680 opened by mishraaditya595 - 1
[BUG] TypeError: qwen2_moe isn't supported yet.
#724 opened by JiaXinLI98 - 1
[BUG] deepseek_v2 isn't supported yet
#733 opened by jli943 - 2
[BUG] Cannot install from source
#669 opened by victoryeo - 0
New release?
#727 opened by stoical07 - 2
[FEATURE] ChatGLM Support Added
#684 opened by Qubitium - 1
[BUG] Unable to quantize Falcon-7b
#726 opened by wchen61 - 1
[FEATURE] Quantize Embedding
#723 opened by RanchiZhao - 0
The output of 8-bit Mixtral-8x7B-v0.1-GPTQ is strange
#720 opened by JustQJ - 0
I encountered a problem while determining the number of calibration dataset samples
#717 opened by jiangchengchengark - 1
How to quantize models with Triton v2
#712 opened by Coco58323 - 0
CUDA extension not installed
#694 opened by yaldashbz - 1
Target modules [] not found in the base model. Please check the target modules and try again.
#667 opened by RMimo - 1
[BUG] torch._C._LinAlgError: linalg.cholesky: The factorization could not be completed because the input is not positive-definite (the leading minor of order 1 is not positive-definite).
#708 opened by Yanyao-Guan-gzu - 1
[FEATURE] Speed up model packing
#709 opened by DeJoker - 2
[BUG] How can I split the final single .safetensors file into shards smaller than 4GB each?
#707 opened by fzp0424 - 0
[BUG] ARM installation error
#665 opened by DavidePaglieri - 1
How to quantize an inherited linear layer?
#704 opened by nzomi - 0
[BUG] Abnormal inference results after int8 quantization of Qwen1.5-32B
#705 opened by flyerming - 1
[FEATURE] Pass in attention mask and input ids for the calibration dataset via Hugging Face's GPTQConfig
#697 opened by RanchiZhao - 0
[FEATURE] Why del new_example["labels"]?
#703 opened by RanchiZhao - 0
Buffers in Marlin setting
#699 opened by yaldashbz - 0
Add support for Gemma2 models.
#700 opened by markoarnauto - 1
[BUG] safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer
#661 opened by chuangzhidan - 1
[BUG] Cannot install auto-gptq for 910B on aarch64
#692 opened by luoan7248 - 0
[Issue] wheel package for CUDA 12.1
#685 opened by sudhanshu746 - 2
[BUG] Following the quant_with_alpaca.py example, but I keep getting "You shouldn't move a model that is dispatched using accelerate hooks." and the model is never saved.
#670 opened by murtaza-nasir - 0
[BUG]
#681 opened by yuyu990116 - 0
How to get a dequantized model?
#679 opened by mxjmtxrm - 1
[FEATURE] Add support for DeepSeek-V2
#664 opened by Xu-Chen - 4
[FEATURE] Added code support for 5-, 6-, and 7-bit quantization; can you please add me as a contributor? I will create a new pull request.
#675 opened by thoorpukarnakar - 0
Question about data shape difference between quantization and forward
#674 opened by sleepwalker2017 - 0
How to select between different kernels?
#673 opened by sleepwalker2017 - 0
[FEATURE] Add marlin24 support
#672 opened by Qubitium - 0
[FEATURE] GPTQ support for MoE models
#671 opened by CallmeZhangChenchen - 0
[BUG] ROCm installation and building broken
#666 opened by xangelix - 0