Is there a quantized version of kwaikeg/kagentlms_qwen_14b_mat?
akan opened this issue · 1 comment
akan commented
I want to serve the model with vLLM, but quantizing it with AWQ fails:
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer
model_path = '/home/ubuntu/kagentlms_qwen_14b_mat'
quant_path = 'kagentlms_qwen14bmat-awq'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 8, "version": "GEMM" }
model = AutoAWQForCausalLM.from_pretrained(model_path, **{"low_cpu_mem_usage": True})
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
The quantize step fails with: RuntimeError: cutlassF: no kernel found to launch!
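A possible workaround, not verified on this model: the cutlassF error typically comes from PyTorch's scaled-dot-product attention choosing a fused kernel that isn't available for the GPU/dtype used during calibration. Disabling the fused SDPA backends so the math fallback is used sometimes avoids it. The sketch below is based on that assumption; it also sets w_bit to 4, since as far as I know AutoAWQ's GEMM kernels only support 4-bit weights.

# Hedged sketch: disable fused SDPA kernels before AWQ calibration.
# This assumes the cutlassF error comes from the memory-efficient/flash SDPA
# backends; it is not a confirmed fix for kagentlms_qwen_14b_mat.
import torch
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

torch.backends.cuda.enable_flash_sdp(False)          # turn off the FlashAttention kernel
torch.backends.cuda.enable_mem_efficient_sdp(False)  # turn off the memory-efficient (cutlass) kernel
torch.backends.cuda.enable_math_sdp(True)            # keep the plain math fallback

model_path = '/home/ubuntu/kagentlms_qwen_14b_mat'
quant_path = 'kagentlms_qwen14bmat-awq'
# AutoAWQ's GEMM path targets 4-bit weights, so w_bit is 4 here instead of 8
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(
    model_path,
    low_cpu_mem_usage=True,
    trust_remote_code=True,  # Qwen checkpoints often ship custom modeling code
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)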
With llama.cpp, I ran:
python3 convert-hf-to-gguf.py ../kagentlms_qwen_14b_mat/ --outfile ../kagentlms_qwen14bmat.gguf
and the error is:
FileNotFoundError: [Errno 2] No such file or directory: '../kagentlms_qwen_14b_mat/pytorch_model-00001-of-00004.bin'
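That error suggests the converter expects a sharded PyTorch checkpoint (pytorch_model-00001-of-00004.bin) that isn't in the directory, which usually means an incomplete download or a checkpoint stored under different filenames (for example safetensors). A quick sanity check, written as a sketch against the usual Hugging Face index-file layout (the index filename is an assumption about this checkpoint):

# Hedged sketch: compare the shards named in the index file against what is
# actually on disk in the checkpoint directory.
import json, os

model_dir = "../kagentlms_qwen_14b_mat"
print(sorted(os.listdir(model_dir)))  # see which weight files really exist

index_path = os.path.join(model_dir, "pytorch_model.bin.index.json")  # assumed filename
if os.path.exists(index_path):
    with open(index_path) as f:
        index = json.load(f)
    expected = sorted(set(index["weight_map"].values()))
    missing = [name for name in expected if not os.path.exists(os.path.join(model_dir, name))]
    print("expected shards:", expected)
    print("missing shards:", missing)  # non-empty means the download is incomplete

If the directory only contains *.safetensors files, re-downloading the .bin shards or using a newer convert-hf-to-gguf.py that reads safetensors directly may resolve it.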
Vincentyua commented
There is no quantized version available yet.