KwaiKEG/KwaiAgents

Is there a quantized version of kwaikeg/kagentlms_qwen_14b_mat?

Closed this issue · 1 comment

akan commented

I want to serve it with vLLM, but quantizing with AWQ fails:

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

# Local checkpoint in, quantized model out
model_path = '/home/ubuntu/kagentlms_qwen_14b_mat'
quant_path = 'kagentlms_qwen14bmat-awq'
# AWQ settings: zero-point, group size 128, 8-bit weights, GEMM kernels
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 8, "version": "GEMM" }

# Load the full-precision model and tokenizer
model = AutoAWQForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Run AWQ calibration/quantization, then save the result
model.quantize(tokenizer, quant_config=quant_config)

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)

RuntimeError: cutlassF: no kernel found to launch!
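A hedged guess, not verified against this model: the cutlassF error usually means PyTorch's fused attention kernels have no implementation for the dtype/GPU combination in use (it commonly shows up when the model is loaded in fp32), and AutoAWQ's GEMM kernels are 4-bit only, so w_bit=8 with version GEMM is also suspect. A sketch of the same flow, loading in fp16 with trust_remote_code for the Qwen architecture and w_bit=4, might be worth trying:

# Untested sketch: same AutoAWQ flow as above, but fp16 load, trust_remote_code
# for the Qwen custom modeling code, and 4-bit weights for the GEMM kernels.
import torch
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = '/home/ubuntu/kagentlms_qwen_14b_mat'
quant_path = 'kagentlms_qwen14bmat-awq'
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(
    model_path,
    low_cpu_mem_usage=True,
    trust_remote_code=True,    # Qwen checkpoints ship custom modeling code
    torch_dtype=torch.float16, # fp16 avoids the fp32 attention paths that lack cutlass kernels on some GPUs
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)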

With llama.cpp, python3 convert-hf-to-gguf.py ../kagentlms_qwen_14b_mat/ --outfile ../kagentlms_qwen14bmat.gguf

the error is:
FileNotFoundError: [Errno 2] No such file or directory: '../kagentlms_qwen_14b_mat/pytorch_model-00001-of-00004.bin'
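The missing pytorch_model-00001-of-00004.bin suggests convert-hf-to-gguf.py is looking for sharded .bin weights that this checkpoint does not ship in that exact layout (it may be safetensors-only, or sharded differently). A small diagnostic sketch, with the path assumed to match the command above, to see which weight files the converter can actually find:

# Diagnostic sketch: list the weight files present in the checkpoint directory.
from pathlib import Path

ckpt_dir = Path('../kagentlms_qwen_14b_mat')  # same path as in the convert command above
for pattern in ('pytorch_model*.bin', '*.safetensors', '*.index.json'):
    matches = sorted(ckpt_dir.glob(pattern))
    print(f'{pattern}: {[p.name for p in matches] or "none"}')

If the directory turns out to contain only .safetensors shards, updating llama.cpp to a convert-hf-to-gguf.py version that reads safetensors, or re-saving the checkpoint with save_pretrained(..., safe_serialization=False), would be the obvious next steps.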

No quantized version is available at the moment.