AttributeError: 'HQQLinear' object has no attribute 'weight'
mxjmtxrm opened this issue · 8 comments
System Info
- transformers version: 4.41.0.dev0
- Platform: Linux-5.15.0-92-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.21.4
- Safetensors version: 0.4.2
- Accelerate version: 0.28.0
- Accelerate config: not found
- PyTorch version (GPU?): 2.2.0a0+81ea7a4 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
When I load a model with an HQQ quantization config, I get the following error:
File "/workspace/code/utils.py", line 88, in create_and_prepare_model
model = AutoModelForCausalLM.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
return model_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3693, in from_pretrained
) = cls._load_pretrained_model(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 4127, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 848, in _load_state_dict_into_meta_model
old_param = getattr(old_param, split)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1687, in __getattr__
raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'HQQLinear' object has no attribute 'weight'
It seems that there is a bug in quantizer_hqq.py:
def check_quantized_param(
self,
model: "PreTrainedModel",
param_value: "torch.Tensor",
param_name: str,
state_dict: Dict[str, Any],
**kwargs,
) -> bool:
module, tensor_name = get_module_from_name(model, param_name)
return isinstance(module, torch.nn.Linear)
In the above code, the check should also look at tensor_name and exclude 'bias'. Otherwise the layer is replaced by HQQLinear when the bias tensor is handled, and loading the corresponding weight then fails with the AttributeError above.
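Something like the following is what I have in mind (just a sketch of the idea, not a tested patch; the tensor_name condition is my suggestion, not existing library code):
def check_quantized_param(
    self,
    model: "PreTrainedModel",
    param_value: "torch.Tensor",
    param_name: str,
    state_dict: Dict[str, Any],
    **kwargs,
) -> bool:
    module, tensor_name = get_module_from_name(model, param_name)
    # Only the weight tensor should trigger HQQ quantization; handling the
    # bias here swaps the nn.Linear for HQQLinear too early.
    return isinstance(module, torch.nn.Linear) and tensor_name == "weight"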
How can I solve this problem?
Expected behavior
--
Hi @mxjmtxrm
Can you share which model you are trying to quantize?
It is my own model based on the HF Llama 2 7B. I just set bias=True on the qkv projections, so the pretrained checkpoint contains .bias tensors, and then the above error arose.
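Roughly how the model is set up (a sketch only; the checkpoint path is a placeholder and the HQQ settings here are illustrative, not my exact values):
from transformers import AutoModelForCausalLM, HqqConfig, LlamaConfig

# Llama-2-7B architecture with a bias on the q/k/v projections.
config = LlamaConfig.from_pretrained("meta-llama/Llama-2-7b-hf", attention_bias=True)
quant_config = HqqConfig(nbits=4, group_size=64)
model = AutoModelForCausalLM.from_pretrained(
    "path/to/my-llama2-7b-with-qkv-bias",  # placeholder path
    config=config,
    quantization_config=quant_config,
    device_map="cuda",
)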
I use the distil-whisper model and I'm getting the same error. The BitsAndBytesConfig quantization method works, but the HQQ method gives this error.
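For reference, the bitsandbytes load that works looks roughly like this (a sketch assuming a 4-bit config and the distil-large-v2 checkpoint; the exact settings may differ):
from transformers import AutoModelForSpeechSeq2Seq, BitsAndBytesConfig

# 4-bit bitsandbytes load of the same checkpoint; this path works, while the
# equivalent HqqConfig load raises the AttributeError above.
bnb_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    "distil-whisper/distil-large-v2",
    quantization_config=bnb_config,
    device_map="cuda",
)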
Can you share a code snippet to reproduce this please?
Thanks everyone! Indeed I was able to repro with:
from transformers import AutoModelForSpeechSeq2Seq, HqqConfig
model_id = "distil-whisper/distil-large-v2"
quant_config = HqqConfig(nbits=1, group_size=64, quant_zero=False, quant_scale=False, axis=0)
model = AutoModelForSpeechSeq2Seq.from_pretrained(model_id, quantization_config=quant_config, device_map="cuda")
print(model)
Fixed it, will do a PR right now.
Awesome, thanks!