AssertionError: Fail to convert pytorch model
anthony-intel opened this issue · 3 comments
anthony-intel commented
This happens when running the example code as-is:
from transformers import AutoTokenizer, TextStreamer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

model_name = "Intel/neural-chat-7b-v3-1"  # Hugging Face model_id or local model
prompt = "Once upon a time, there existed a little girl,"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
inputs = tokenizer(prompt, return_tensors="pt").input_ids
streamer = TextStreamer(tokenizer)

model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)  # 4-bit weight-only quantization via Neural Speed
outputs = model.generate(inputs, streamer=streamer, max_new_tokens=300)
print(outputs)
yields:
2024-03-27 02:12:43 [INFO] Using Neural Speed...
2024-03-27 02:12:43 [INFO] cpu device is used.
2024-03-27 02:12:43 [INFO] Applying Weight Only Quantization.
2024-03-27 02:12:43 [INFO] Using LLM runtime.
cmd: ['python', PosixPath('/usr/local/lib/python3.10/dist-packages/neural_speed/convert/convert_mistral.py'), '--outfile', 'runtime_outs/ne_mistral_f32.bin', '--outtype', 'f32', 'Intel/neural-chat-7b-v3-1']
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-17-40dcb74a8701> in <cell line: 10>()
8 streamer = TextStreamer(tokenizer)
9
---> 10 model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)
11 outputs = model.generate(inputs, streamer=streamer, max_new_tokens=300)
12 print(outputs)
1 frames
/usr/local/lib/python3.10/dist-packages/neural_speed/__init__.py in init(self, model_name, use_quant, use_gptq, use_awq, use_autoround, weight_dtype, alg, group_size, scale_dtype, compute_dtype, use_ggml)
129 if not os.path.exists(fp32_bin):
130 convert_model(model_name, fp32_bin, "f32")
--> 131 assert os.path.exists(fp32_bin), "Fail to convert pytorch model"
132
133 if not use_quant:
AssertionError: Fail to convert pytorch model
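Note: the assertion only reports that the output file is missing, not why the conversion failed. One way to surface the underlying error (a debugging sketch, reusing the exact command from the log above) is to run the converter script directly and read its output:

python /usr/local/lib/python3.10/dist-packages/neural_speed/convert/convert_mistral.py --outfile runtime_outs/ne_mistral_f32.bin --outtype f32 Intel/neural-chat-7b-v3-1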
zhentaoyu commented
Hi, this issue seems to have the same root cause as #193. pip install neural_speed won't install all of the packages from requirements.txt, and we are working on a fix. In the meantime, you can run pip install -r requirements.txt as a quick workaround, as sketched below. Thanks.
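A sketch of the workaround (assuming the requirements.txt comes from a clone of the neural-speed repository):

git clone https://github.com/intel/neural-speed.git
cd neural-speed
pip install -r requirements.txt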
anthony-intel commented
@zhentaoyu thanks - looking forward to the fix
zhentaoyu commented
Hi @anthony-intel, the issue is now fixed; please refer to https://github.com/intel/neural-speed?tab=readme-ov-file#installation. If you have no further questions, we will close this issue. Thanks.