OpenGVLab/EfficientQAT

EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

Python

Issues

Unable to reproduce the result in paper for 4 bit wikitext perplexity
#24 opened 11 days ago by huoshenlaile
1
Is it possible to run e2e-qp process on a single 4090?
#23 opened a month ago by sihouzi21c
2
Can the results of the quantization process be saved?
#17 opened 2 months ago by QB-Chen
1
RuntimeError: Triton Error [CUDA]: device kernel image is invalid
#20 opened 2 months ago by Niko-zyf
1
any experiments on qwen2-7b-instruct?
#22 opened 2 months ago by brisker
1
Is 7B llama speed expected to be slow?
#19 opened 3 months ago by w32zhong
2
Reproduce Llama3 8B Instruct results
#21 opened 2 months ago by gdsaikrishna
2
QuIP# inference speed in your paper is incorrect?
#9 opened 3 months ago by sankeerth95
2
Why is model output so slow?
#18 opened 3 months ago by dahwin
1
Question about calib datasets.
#16 opened 3 months ago by mxjmtxrm
3
GGUF
#3 opened 4 months ago by maxim-saplin
8
Multi-GPU parallelism does not seem to work during block_ap; it keeps running out of GPU memory
#15 opened 3 months ago by QB-Chen
1
Cohere Command R Plus
#10 opened 3 months ago by Khaledhesham
1
Data arrangement
#7 opened 3 months ago by yancaoweidaode
4
How to do e2e_qp with multi-GPU?
#14 opened 3 months ago by mxjmtxrm
1
DATA FOR TRAINING
#12 opened 3 months ago by LiMa-cas
1
Loss is nan
#6 opened 3 months ago by mxjmtxrm
4
Reproduce Llama2-7b
#13 opened 3 months ago by laomao0
5
Evaluation pipeline points to missing files
#11 opened 4 months ago by snps-tonatiuh
2
question about real quant
#4 opened 4 months ago by mxjmtxrm
6
Can not reproduce the results
#8 opened 4 months ago by LiuSiQi-TJ
2
Any experiments on w4a16 with group_size=-1 (no group)?
#2 opened 4 months ago by brisker
4
dimension mismatch error.
#5 opened 4 months ago by mxjmtxrm
0