Qwen2 GPTQ breaks in cpp_model.Model.np_bestla_qpack
Closed this issue · 1 comment
Hi,
The model I used is Qwen1.5-0.5B-Chat-GPTQ-Int4 from Hugging Face.
After debugging, it seems the model is not converted correctly by `cpp_model.Model.np_bestla_qpack(`; it breaks there without any error or message being shown.
The program can still continue to run, though, and it eventually shows an error on the first generation attempt:

```
error loading model: model.cpp: tensor 'model.layers.0.self_attn.q_proj.weight' is missing from model
```
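A quick way to check whether the source checkpoint itself is missing tensors is to list its tensor names without loading the full weights. A minimal sketch using `huggingface_hub` and `safetensors` (the repo id is the one used above):

```python
# Sketch: fetch the checkpoint and list its tensor names lazily,
# without materializing any weights in memory.
from huggingface_hub import hf_hub_download
from safetensors import safe_open

path = hf_hub_download("Qwen/Qwen1.5-0.5B-Chat-GPTQ-Int4", "model.safetensors")
with safe_open(path, framework="pt") as f:
    print(sorted(f.keys()))
```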
@yuchen2580 Hi, thanks for reporting this issue.
I noticed this problem for Qwen1.5-0.5B-Chat-GPTQ-Int4.
I think this GPTQ model probably has some problems. It should be derived from https://hf-mirror.com/Qwen/Qwen1.5-0.5B.
In Qwen1.5-0.5B-Chat-GPTQ-Int4, there is no lm_head.weight in model.safetensors, but the original Qwen1.5-0.5B does have this weight.
If you use the following commands to inspect the two checkpoints, you will see the problem:
```python
from safetensors.torch import load_file

# Load the checkpoint and list its tensor names
tensors = load_file("model.safetensors")
print(tensors.keys())
```
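To make the difference explicit, you can also test for the tensor directly, using the same `tensors` dict loaded above:

```python
# Present in the original Qwen1.5-0.5B checkpoint,
# absent from the GPTQ-Int4 one.
print("lm_head.weight" in tensors)
```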
Qwen1.5-0.5B-Chat-GPTQ-Int4 also doesn't work on the HF side, which is why I suspect something is wrong with this model.
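For the HF-side check, a minimal sketch is to load the checkpoint directly with transformers (this assumes the auto-gptq/optimum dependencies that transformers needs for GPTQ checkpoints are installed):

```python
# Sketch: try loading the GPTQ checkpoint directly with transformers.
# Per the report above, this fails for this particular checkpoint.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-0.5B-Chat-GPTQ-Int4", device_map="auto"
)
```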
Alternatively, you can try https://huggingface.co/Qwen/Qwen1.5-7B-Chat-GPTQ-Int4; it works.