Issues
auto_gptq-0.7.1.tar.gz has inconsistent version: expected '0.7.1', but metadata has '0.7.1+cu121'
#730 opened by q5sys - 3
Is quantization of InternVL2 supported now?
#732 opened by Jeremy-J-J - 3
[BUG] Language model does not see CUDA even though PyCharm and PyTorch see it; what could be the problem?
#725 opened by GoonerTim - 1
[BUG] Not able to install on Ubuntu 22.04 (subprocess-exited-with-error )
#680 opened by mishraaditya595 - 1
[BUG] TypeError: qwen2_moe isn't supported yet.
#724 opened by JiaXinLI98 - 1
[BUG] deepseek_v2 isn't supported yet
#733 opened by jli943 - 2
[BUG] Cannot install from source
#669 opened by victoryeo - 0
New release?
#727 opened by stoical07 - 2
[FEATURE] ChatGLM Support Added
#684 opened by Qubitium - 1
[BUG] Unable to quantize Falcon-7b
#726 opened by wchen61 - 1
[FEATURE] Quantize Embedding
#723 opened by RanchiZhao - 0
The output of 8-bit Mixtral-8x7B-v0.1-GPTQ is strange
#720 opened by JustQJ - 0
I encountered a problem while determining the number of calibration dataset samples
#717 opened by jiangchengchengark - 1
How to quantize models with Triton v2
#712 opened by Coco58323 - 0
CUDA extension not installed
#694 opened by yaldashbz - 1
Target modules [] not found in the base model. Please check the target modules and try again.
#667 opened by RMimo - 1
[BUG] torch._C._LinAlgError: linalg.cholesky: The factorization could not be completed because the input is not positive-definite (the leading minor of order 1 is not positive-definite).
#708 opened by Yanyao-Guan-gzu - 1
[FEATURE] Speed up model packing
#709 opened by DeJoker - 2
[BUG] How can I split the final single .safetensors file into shards smaller than 4GB each?
#707 opened by fzp0424 - 0
[BUG] ARM installation error
#665 opened by DavidePaglieri - 1
How to quantize an inherited linear layer?
#704 opened by nzomi - 0
[BUG] Abnormal inference results after int8 quantization of Qwen1.5-32B
#705 opened by flyerming - 1
[FEATURE] Pass in attention mask and input ids for the calibration dataset via Hugging Face's GPTQConfig
#697 opened by RanchiZhao - 0
[FEATURE] Why del new_example["labels"]?
#703 opened by RanchiZhao - 0
Buffers in Marlin setting
#699 opened by yaldashbz - 0
Add support for Gemma2 models.
#700 opened by markoarnauto - 1
[BUG] safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer
#661 opened by chuangzhidan - 1
[BUG] Cannot install auto-gptq for 910B on aarch64
#692 opened by luoan7248 - 0
[Issue] wheel package for CUDA 12.1
#685 opened by sudhanshu746 - 2
[BUG] Following the quant_with_alpaca.py example, but I keep getting "You shouldn't move a model that is dispatched using accelerate hooks." and the model is never saved.
#670 opened by murtaza-nasir - 0
[BUG]
#681 opened by yuyu990116 - 0
How to get a dequantized model?
#679 opened by mxjmtxrm - 1
[FEATURE] Add support for DeepSeek-V2
#664 opened by Xu-Chen - 4
[FEATURE] Added code support for 5-, 6-, and 7-bit quantization; can you please add me as a contributor? I will create a new pull request.
#675 opened by thoorpukarnakar - 0
Question about data shape difference between quantization and forward
#674 opened by sleepwalker2017 - 0
How to select between different kernels?
#673 opened by sleepwalker2017 - 0
[FEATURE] Add marlin24 support
#672 opened by Qubitium - 0
[FEATURE] GPTQ support for MoE models
#671 opened by CallmeZhangChenchen - 0
[BUG] ROCm installation and building broken
#666 opened by xangelix - 0