intel-analytics/ipex-llm

Unable to save quantized model

Opened this issue · 1 comment

I'm trying to save an INT4 quantized model. When I try to save it, I get the following error:
Traceback (most recent call last):
  File "C:\Users\AI-Perf\Varsha\ipex-llm\python\llm\example\GPU\HF-Transformers-AutoModels\Save-Load\generate.py", line 58, in <module>
    model.save_low_bit(save_path)
  File "C:\Users\AI-Perf\.conda\envs\ipex-llm\Lib\site-packages\ipex_llm\transformers\model.py", line 62, in save_low_bit
    delattr(self.config, "_pre_quantization_dtype")
AttributeError: 'LlamaConfig' object has no attribute '_pre_quantization_dtype'
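For context, the relevant part of the Save-Load example boils down to the sketch below (`model_path` and `save_path` are placeholders for my local paths); the final `save_low_bit` call is the one that raises:

```python
from ipex_llm.transformers import AutoModelForCausalLM

# Load the checkpoint and quantize its weights to INT4 on the fly.
model = AutoModelForCausalLM.from_pretrained(
    model_path,          # placeholder: local path or HF Hub model id
    load_in_4bit=True,
    trust_remote_code=True,
)

# Persist the quantized weights; this is the call that raises the
# AttributeError above when the config lacks _pre_quantization_dtype.
model.save_low_bit(save_path)  # placeholder: output directory
```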

Attached is my code: generate_profile_thebloke.txt
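If the save really needs to go through, one hypothetical workaround (untested, based only on the traceback above) is to make sure the attribute exists before calling `save_low_bit`, so the library's unconditional `delattr` succeeds:

```python
import torch

# Hypothetical workaround: _pre_quantization_dtype is normally set by
# transformers' quantization path, and an AWQ checkpoint may not carry it.
# Setting it beforehand lets save_low_bit's delattr() call succeed.
if not hasattr(model.config, "_pre_quantization_dtype"):
    model.config._pre_quantization_dtype = torch.float16  # assumed dtype

model.save_low_bit(save_path)
```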

Hi, you are loading a llama2-7b-AWQ model and want to save it in ipex-llm low-bit format? AWQ is already a quantized format, so you can just load it directly each time and don't need to save it again.
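For example, something like the following sketch should work, assuming the model in question is TheBloke's Llama-2-7B AWQ checkpoint on the Hugging Face Hub (substitute your actual model id):

```python
from ipex_llm.transformers import AutoModelForCausalLM

# AWQ checkpoints are already quantized, so just load them directly;
# no save_low_bit round-trip is needed.
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7B-AWQ",  # assumption: the AWQ model being used
    load_in_4bit=True,          # convert the AWQ weights to ipex-llm INT4
    trust_remote_code=True,
)
```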