huggingface/transformers

AttributeError: 'HQQLinear' object has no attribute 'weight'

mxjmtxrm opened this issue · 8 comments

System Info

  • transformers version: 4.41.0.dev0
  • Platform: Linux-5.15.0-92-generic-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • Huggingface_hub version: 0.21.4
  • Safetensors version: 0.4.2
  • Accelerate version: 0.28.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.2.0a0+81ea7a4 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

@SunMarc and @younesbelkada

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

When I load a model with an HQQ quantization config, I get the following error:

File "/workspace/code/utils.py", line 88, in create_and_prepare_model
    model = AutoModelForCausalLM.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3693, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 4127, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 848, in _load_state_dict_into_meta_model
    old_param = getattr(old_param, split)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1687, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'HQQLinear' object has no attribute 'weight'

It seems that there is a bug in quantizer_hqq.py:

def check_quantized_param(
    self,
    model: "PreTrainedModel",
    param_value: "torch.Tensor",
    param_name: str,
    state_dict: Dict[str, Any],
    **kwargs,
) -> bool:
    module, tensor_name = get_module_from_name(model, param_name)

    return isinstance(module, torch.nn.Linear)

In the above code, the check should also look at whether tensor_name is 'bias'; otherwise the layer is replaced by HQQLinear even when the tensor being loaded is a bias. How can I solve this problem?
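A minimal sketch of a fix along the lines suggested above, assuming the check is simply restricted to weight tensors (this is the reporter's suggestion, not necessarily the patch that was ultimately merged):

def check_quantized_param(
    self,
    model: "PreTrainedModel",
    param_value: "torch.Tensor",
    param_name: str,
    state_dict: Dict[str, Any],
    **kwargs,
) -> bool:
    module, tensor_name = get_module_from_name(model, param_name)

    # Only flag the weight of a torch.nn.Linear for HQQ handling; bias tensors
    # should go through the regular loading path instead of triggering the
    # replacement of the layer by HQQLinear.
    return isinstance(module, torch.nn.Linear) and tensor_name == "weight"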

Expected behavior

--

Hi @mxjmtxrm
Can you share which model you are trying to quantize?

It is my own model based on the HF Llama 2 7B. I just set the bias of the qkv projections to True, so the pretrained checkpoint contains .bias tensors, and then the above error arose.
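For reference, a minimal sketch of that setup; the checkpoint path below is a placeholder for the reporter's custom Llama-2-7B variant whose q/k/v projections were trained with bias=True:

from transformers import AutoModelForCausalLM, HqqConfig

# Placeholder: a Llama-2-7B-style checkpoint whose state dict contains
# q_proj.bias / k_proj.bias / v_proj.bias entries.
model_path = "path/to/llama2-7b-with-qkv-bias"

quant_config = HqqConfig(nbits=4, group_size=64, quant_zero=False, quant_scale=False, axis=0)

# Loading with an HQQ config reproduces the AttributeError shown above.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    quantization_config=quant_config,
    device_map="cuda",
)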

I'm using the distil-whisper model and I'm getting the same error. The BitsAndBytesConfig quantization method works; the HQQ method gives this error.

@mobicham,

Can you check?

Can you share a code snippet to reproduce this please?

Thanks everyone! Indeed, I was able to repro with:

from transformers import AutoModelForSpeechSeq2Seq, HqqConfig

model_id = "distil-whisper/distil-large-v2"

quant_config = HqqConfig(nbits=1, group_size=64, quant_zero=False, quant_scale=False, axis=0)

model = AutoModelForSpeechSeq2Seq.from_pretrained(model_id, quantization_config=quant_config, device_map="cuda")
print(model)

Fixed it, will do a PR right now.

Awesome, thanks!