Issues
[BUG] Unable to install on Ubuntu 22.04 (subprocess-exited-with-error)
#680 opened by mishraaditya595 - 6
Llama-3 8B Instruct quantized to 8 Bit spits out gibberish in transformers `model.generate()` but works fine in vLLM?
#657 opened by davidgxue - 0
[Issue] wheel package for CUDA 12.1
#685 opened by sudhanshu746 - 0
[FEATURE] ChatGLM Support Added
#684 opened by Qubitium - 2
[BUG] Following the quant_with_alpaca.py example but keep getting "You shouldn't move a model that is dispatched using accelerate hooks." and the model is never saved.
#670 opened by murtaza-nasir - 0
[BUG]
#681 opened by yuyu990116 - 0
How to get a dequantized model?
#679 opened by mxjmtxrm - 1
[FEATURE] Add support for DeepSeek-V2
#664 opened by Xu-Chen - 0
[FEATURE] Added code support for 5-, 6-, and 7-bit quantization; can you please add me as a contributor? I will create a new pull request
#675 opened by thoorpukarnakar - 0
Question about data shape difference between quantization and forward
#674 opened by sleepwalker2017 - 0
How to select between different kernels?
#673 opened by sleepwalker2017 - 5
[FEATURE] Add marlin24 support
#672 opened by Qubitium - 0
[FEATURE] GPTQ support for MoE models
#671 opened by CallmeZhangChenchen - 1
[BUG] Can not save quantized model to disk: "you shouldn't move a model that is dispatched using accelerate hooks."
#630 opened by tattrongvu - 0
[BUG] Cannot install from source
#669 opened by victoryeo - 0
Target modules [] not found in the base model. Please check the target modules and try again.
#667 opened by RicardoHalak - 6
What magnitude of avg loss indicates a relatively good result for a quantized model?
#649 opened by ehuaa - 0
[BUG] ROCm installation and building broken
#666 opened by xangelix - 0
[BUG] ARM installation error
#665 opened by DavidePaglieri - 8
Why does LLaMA3-8B perform so badly on wikitext2 after GPTQ?
#650 opened by lyx-111111 - 0
[BUG] safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer
#661 opened by chuangzhidan - 1
Why doesn't AutoGPTQ quantize lm_head layer?
#647 opened by XeonKHJ - 8
[BUG] Llama 3 8B Instruct - `no_inject_fused_attention` must be true or else errors out
#646 opened by davidgxue - 3
GPTQ 4-bit avg loss is large
#643 opened by moseshu - 4
[BUG] GPTQ QWEN-72B-Chat
#637 opened by tulipdu955 - 0
[BUG] GPTQ kernels don't work with PEFT
#633 opened by achew010 - 16
[BUG] Regression in quantized inference when paired with Transformers >= 4.39.0
#614 opened by Qubitium - 0
Error when exporting mistral8x7b
#644 opened by v-yunbin - 16
[FEATURE] Add support for DBRX
#621 opened by Xu-Chen - 3
[BUG] TypeError: forward() missing 1 required positional argument: 'hidden_states'
#636 opened by silvacarl2 - 1
Error when quantizing Mixtral 8x7B model: "ZeroDivisionError: float division by zero"
#634 opened by arceus-jia - 10
Error when trying to quantize the JAIS model.
#632 opened by Mohammad-Faris - 1
[FEATURE] Cohere integration?
#612 opened by DavidePaglieri - 0
[FEATURE] Add Jamba support
#629 opened by TechxGenus - 4
[BUG] TypeError: LlamaRotaryEmbedding.forward() got an unexpected keyword argument 'seq_len'
#606 opened by timefliesfang - 0
Zeros remain zero?
#627 opened by RanchiZhao - 0
[BUG] Marlin/AWQ cache collision + Marlin device validation returns false for Ada/Hopper
#616 opened by Qubitium - 0
[FEATURE] AutoGPTQ on Apple Silicon
#615 opened by power9799 - 7
[BUG] Regression in commit: 24efa31
#610 opened by Qubitium - 0
[QUESTION] How to unload AutoGPTQForCausalLM.from_quantized model from GPU to CPU in order to free up GPU memory
#611 opened by tommyyipmonarch - 1
[BUG] Regression due to merged PR #607
#609 opened by Qubitium - 0