Issues
- Transformer 4.46.1 compat (#24, opened by Qubitium, 18 comments)
- Condition to achieve linear speedup? (#15, opened by jiwonsong-dev, 7 comments)
- rotation + GPTQ data (#20, opened by Andy0422, 5 comments)
- Question about the qwen2-1.5b model (#23, opened by darrenearl, 2 comments)
- Question about building W4A8 on the AMD platform (#22, opened by XIAOHUIL1, 1 comment)
- rotate + lm_head quantization (#21, opened by RanchiZhao, 7 comments)
- Question on rotation (#13, opened by cli99, 18 comments)
- Qwen2-1.5B accuracy is completely unusable after quantization (#17, opened by Juelianqvq, 1 comment)
- Question about Marlin fetch_to_registers (#19, opened by darrenearl, 7 comments)
- How to use custom calib data? (#1, opened by Juelianqvq, 6 comments)
- Qwen2-72B-Instruct packing failed (#16, opened by Juelianqvq, 5 comments)
- Qwen2 supported? (#14, opened by Juelianqvq, 2 comments)
- Please share some calibration datasets or examples (#11, opened by skykiseki, 1 comment)
- Does QQQ linear support H100? (#12, opened by donglinz, 1 comment)
- smooth.py throws an error (#8, opened by darrenearl, 1 comment)
- Question about group_size (#10, opened by darrenearl, 2 comments)
- The model quantized with QQQ W4A8 seems to have problems... (#7, opened by Zhao-Dongyu, 24 comments)
- [QST] Speedup of GEMM (#3, opened by Hongbosherlock, 5 comments)
- Can MLA be smoothed? (#6, opened by RanchiZhao, 1 comment)
- What is the prior for loss/error? (#4, opened by RanchiZhao, 3 comments)
- [New Model Supported] MiniCPM-2.4B (#5, opened by RanchiZhao, 30 comments)
- [QST] Scale factors and benchmarks (#2, opened by jeromeku)