Issues
[BUG] Unable to install on Ubuntu 22.04 (subprocess-exited-with-error)
#680 opened by mishraaditya595 - 6
Llama-3 8B Instruct quantized to 8 Bit spits out gibberish in transformers `model.generate()` but works fine in vLLM?
#657 opened by davidgxue - 0
[Issue] wheel package for CUDA 12.1
#685 opened by sudhanshu746 - 0
[FEATURE] ChatGLM Support Added
#684 opened by Qubitium - 2
[BUG] Following the quant_with_alpaca.py example but keep getting "You shouldn't move a model that is dispatched using accelerate hooks." and the model is never saved.
#670 opened by murtaza-nasir - 0
[BUG]
#681 opened by yuyu990116 - 0
How to get a dequantized model?
#679 opened by mxjmtxrm - 1
[FEATURE] Add support for DeepSeek-V2
#664 opened by Xu-Chen - 0
[FEATURE] Added code support for 5-, 6-, and 7-bit quantization; can you please add me as a contributor? I will create a new pull request
#675 opened by thoorpukarnakar - 0
Question about data shape difference between quantization and forward
#674 opened by sleepwalker2017 - 0
How to select between different kernels?
#673 opened by sleepwalker2017 - 5
[FEATURE] Add marlin24 support
#672 opened by Qubitium - 0
[FEATURE] GPTQ support for MoE models
#671 opened by CallmeZhangChenchen - 1
[BUG] Can not save quantized model to disk: "you shouldn't move a model that is dispatched using accelerate hooks."
#630 opened by tattrongvu - 0
[BUG] Cannot install from source
#669 opened by victoryeo - 0
Target modules [] not found in the base model. Please check the target modules and try again.
#667 opened by RicardoHalak - 6
What magnitude of avg loss indicates a relatively good result for a quantized model?
#649 opened by ehuaa - 0
[BUG] ROCm installation and building broken
#666 opened by xangelix - 0
[BUG] ARM installation error
#665 opened by DavidePaglieri - 8
Why does LLaMA3-8B perform so badly on wikitext2 after GPTQ?
#650 opened by lyx-111111 - 0
[BUG] safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer
#661 opened by chuangzhidan - 1
Why doesn't AutoGPTQ quantize lm_head layer?
#647 opened by XeonKHJ - 8
[BUG] Llama 3 8B Instruct - `no_inject_fused_attention` must be true or else errors out
#646 opened by davidgxue - 3
GPTQ 4-bit avg loss is large
#643 opened by moseshu - 4
[BUG] GPTQ QWEN-72B-Chat
#637 opened by tulipdu955 - 0
[BUG] GPTQ kernels don't work with PEFT
#633 opened by achew010 - 16
[BUG] Regression in quantized inference when paired with Transformers >= 4.39.0
#614 opened by Qubitium - 0
Error when exporting mistral8x7b
#644 opened by v-yunbin - 16
[FEATURE] Add support for DBRX
#621 opened by Xu-Chen - 3
[BUG] TypeError: forward() missing 1 required positional argument: 'hidden_states'
#636 opened by silvacarl2 - 1
Error when quantizing Mixtral 8x7B model: "ZeroDivisionError: float division by zero"
#634 opened by arceus-jia - 10
Error when trying to quantize the JAIS model.
#632 opened by Mohammad-Faris - 1
[FEATURE] Cohere integration?
#612 opened by DavidePaglieri - 0
[FEATURE] Add Jamba support
#629 opened by TechxGenus - 4
[BUG] TypeError: LlamaRotaryEmbedding.forward() got an unexpected keyword argument 'seq_len'
#606 opened by timefliesfang - 0
Zeros remain zero?
#627 opened by RanchiZhao - 0
[BUG] Marlin/AWQ cache collision + Marlin device validation returns false for Ada/Hopper
#616 opened by Qubitium - 0
[FEATURE] AutoGPTQ on Apple Silicon
#615 opened by power9799 - 7
[BUG] Regression in commit: 24efa31
#610 opened by Qubitium - 0
[QUESTION] How to unload AutoGPTQForCausalLM.from_quantized model from GPU to CPU in order to free up GPU memory
#611 opened by tommyyipmonarch - 1
[BUG] Regression due to merged PR #607
#609 opened by Qubitium - 0