AttributeError: 'HQQLinear' object has no attribute 'weight'
mxjmtxrm opened this issue · 8 comments
System Info
- transformers version: 4.41.0.dev0
- Platform: Linux-5.15.0-92-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.21.4
- Safetensors version: 0.4.2
- Accelerate version: 0.28.0
- Accelerate config: not found
- PyTorch version (GPU?): 2.2.0a0+81ea7a4 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
When I load a model with an HQQ quantization config, I get the following error:
File "/workspace/code/utils.py", line 88, in create_and_prepare_model
model = AutoModelForCausalLM.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
return model_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3693, in from_pretrained
) = cls._load_pretrained_model(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 4127, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 848, in _load_state_dict_into_meta_model
old_param = getattr(old_param, split)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1687, in __getattr__
raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'HQQLinear' object has no attribute 'weight'
It seems that there is a bug in quantizer_hqq.py:
def check_quantized_param(
self,
model: "PreTrainedModel",
param_value: "torch.Tensor",
param_name: str,
state_dict: Dict[str, Any],
**kwargs,
) -> bool:
module, tensor_name = get_module_from_name(model, param_name)
return isinstance(module, torch.nn.Linear)
In the above code, the check should also look at tensor_name and exclude 'bias'. Otherwise the layer is replaced by HQQLinear when the bias tensor is handled, and loading the corresponding weight then fails with the AttributeError above.
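Something like the following is what I have in mind (just a sketch of the idea, not a tested patch; the tensor_name condition is my suggestion, not existing library code):
def check_quantized_param(
    self,
    model: "PreTrainedModel",
    param_value: "torch.Tensor",
    param_name: str,
    state_dict: Dict[str, Any],
    **kwargs,
) -> bool:
    module, tensor_name = get_module_from_name(model, param_name)
    # Only the weight tensor should trigger HQQ quantization; handling the
    # bias here swaps the nn.Linear for HQQLinear too early.
    return isinstance(module, torch.nn.Linear) and tensor_name == "weight"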
How can I solve this problem?
Expected behavior
--
Hi @mxjmtxrm
Can you share which model you are trying to quantize?
It is my own model based on the HF Llama 2 7B. I just set bias=True on the qkv projections, so the pretrained checkpoint contains .bias tensors, and then the above error arose.
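Roughly how the model is set up (a sketch only; the checkpoint path is a placeholder and the HQQ settings here are illustrative, not my exact values):
from transformers import AutoModelForCausalLM, HqqConfig, LlamaConfig

# Llama-2-7B architecture with a bias on the q/k/v projections.
config = LlamaConfig.from_pretrained("meta-llama/Llama-2-7b-hf", attention_bias=True)
quant_config = HqqConfig(nbits=4, group_size=64)
model = AutoModelForCausalLM.from_pretrained(
    "path/to/my-llama2-7b-with-qkv-bias",  # placeholder path
    config=config,
    quantization_config=quant_config,
    device_map="cuda",
)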
I use the distil-whisper model and I'm getting the same error. The BitsAndBytesConfig quantization method works, but the HQQ method gives this error.
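For reference, the bitsandbytes load that works looks roughly like this (a sketch assuming a 4-bit config and the distil-large-v2 checkpoint; the exact settings may differ):
from transformers import AutoModelForSpeechSeq2Seq, BitsAndBytesConfig

# 4-bit bitsandbytes load of the same checkpoint; this path works, while the
# equivalent HqqConfig load raises the AttributeError above.
bnb_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    "distil-whisper/distil-large-v2",
    quantization_config=bnb_config,
    device_map="cuda",
)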
Can you share a code snippet to reproduce this please?
Thanks everyone! Indeed I was able to repro with:
from transformers import AutoModelForSpeechSeq2Seq, HqqConfig
model_id = "distil-whisper/distil-large-v2"
quant_config = HqqConfig(nbits=1, group_size=64, quant_zero=False, quant_scale=False, axis=0)
model = AutoModelForSpeechSeq2Seq.from_pretrained(model_id, quantization_config=quant_config, device_map="cuda")
print(model)
Fixed it, will do a PR right now.
Awesome, thanks!