Issues
Question about Hadamard dimension
#44 opened by mxjmtxrm - 1
Reproducing paper Table 8
#43 opened by mjyun01 - 3
Question about rotation.
#21 opened by mxjmtxrm - 3
[Q] Hadamard matrix size does not match
#41 opened by Coco58323 - 1
Questions about the rotation
#39 opened by Gloria2tt - 1
Accuracy drop after `fuse_layer_norms`
#34 opened by Niko-zyf - 6
Inference
#37 opened by zhentingqi - 14
OPT model PPL bug
#12 opened by zhsky2017 - 1
Mistral support
#35 opened by DavidePaglieri - 4
mlp_sizes seems wrong in qlinear_benchmark.py
#33 opened by yyfcc17 - 3
args.distribute_model seems to be undefined
#31 opened by WeiMa01 - 6
Other quantization results of rotated model
#25 opened by mxjmtxrm - 4
OPT model with LayerNorm: can the input of LayerNorm use a Hadamard transform?
#29 opened by JiangYongYu1 - 1
How to deal with GQA?
#20 opened by mxjmtxrm - 3
Relations with SpinQuant?
#28 opened by RanchiZhao - 6
How to get models with only offline rotation (or models for weight-only quantization)
#24 opened by Tracin - 12
Accuracy of weight-only quantization decreases significantly after weight rotation
#22 opened by luchangli03 - 1
Question about exact_had_to_linear
#23 opened by mxjmtxrm - 2
Multi-GPU inference
#19 opened by hensiesp32 - 1
Question about reproducing Fig.1
#14 opened by xinghaow99 - 1
How to get a fake quantized model?
#18 opened by mxjmtxrm - 1
Can we directly load a QuaRot-GPTQ quantized model and do lm_eval evaluation?
#13 opened by Shuai-Xie - 4
Questions on online quantization
#11 opened by lzhangzz - 1
Some questions
#9 opened by catid - 0
Online hadamard bug
#10 opened by nailimixaM - 4
Do I need to merge a Hadamard matrix into W_v if I only want to do 4-bit KV caching?
#5 opened by YLGH - 12
Applying rotation to HuggingFace model
#1 opened by YLGH - 1