OpenGVLab/EfficientQAT
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Python
Issues
any experiments on qwen2-7b-instruct?
#22 opened by brisker - 2
Is 7B llama speed expected to be slow?
#19 opened by w32zhong - 2
Reproduce Llama3 8B Instruct results
#21 opened by gdsaikrishna - 2
why model output is too much slow?
#18 opened by dahwin - 3
Question about calib datasets.
#16 opened by mxjmtxrm - 8
GGUF
#3 opened by maxim-saplin - 1
Multi-GPU parallelism does not seem to work during block_ap; it keeps running out of GPU memory
#15 opened by QB-Chen - 1
Cohere Command R Plus
#10 opened by Khaledhesham - 4
Data arrangement
#7 opened by yancaoweidaode - 1
How to do e2e_qp with multi-GPU?
#14 opened by mxjmtxrm - 1
DATA FOR TRAINING
#12 opened by LiMa-cas - 4
Loss is nan
#6 opened by mxjmtxrm - 5
Reproduce Llama2-7b
#13 opened by laomao0 - 2
Evaluation pipeline points to missing files
#11 opened by snps-tonatiuh - 6
question about real quant
#4 opened by mxjmtxrm - 2
Can not reproduce the results
#8 opened by LiuSiQi-TJ - 4
dimension mismatch error.
#5 opened by mxjmtxrm