AttributeError: 'QuantLinear' object has no attribute 'weight' (t5 branch) (Google/flan-ul2)
sigmareaver opened this issue · 2 comments
i7-13700k
128GB RAM
RTX 4090
Python = 3.9.10
Transformers = 4.30.0.dev0
PyTorch = 2.0.1
Model = Google/flan-ul2
Quantization command:
PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 python t5.py ../full-models/flan-ul2 wikitext2 --nsamples 256 --wbits 4 --act-order --groupsize 128 --save ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq.pt
I also needed to edit t5_sequential() to run on 24GB of VRAM, but I don't think this should affect the model? The following snippet shows the extent of my changes, except for import gc at the top of the file.
del layer
del gptq
gc.collect()
torch.cuda.empty_cache()
inps, outs = outs, inps
# do this part on CPU, because GPU runs out of memory
dev = 'cpu'
model.encoder.final_layer_norm = model.encoder.final_layer_norm.to(dev)
model.encoder.dropout = model.encoder.dropout.to(dev)
encoder_hidden_states = model.encoder.final_layer_norm(inps.cpu())
encoder_hidden_states = model.encoder.dropout(encoder_hidden_states)
model.encoder.final_layer_norm = model.encoder.final_layer_norm.cpu()
model.encoder.dropout = model.encoder.dropout.cpu()
dev = 'cuda:0'
encoder_hidden_states = encoder_hidden_states.to(dev)
inps = inps.to(dev)
# end of CPU section
Otherwise my 4090 runs out of memory when trying to load model.encoder.final_layer_norm = model.encoder.final_layer_norm.to(dev) to the GPU.
Benchmark command (also applies to t5_inference.py):
python t5.py ../full-models/flan-ul2 wikitext2 --load ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq.pt --wbits 4 --groupsize 128 --benchmark --benchmark_mode mmlu
Yields the following error:
Traceback (most recent call last):
  File "/mnt/Storage/ai-dev/t5-gptq/t5.py", line 752, in <module>
    mmlu_benchmark(model, tokenizer, args)
  File "/mnt/Storage/ai-dev/t5-gptq/t5.py", line 542, in mmlu_benchmark
    cors, acc, probs = mmlu_eval(args, subject, model, tokenizer, dev_df, test_df, (idx,len(subjects)))
  File "~/anaconda3/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/Storage/ai-dev/t5-gptq/t5.py", line 473, in mmlu_eval
    logits = model(
  File "~/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "~/anaconda3/lib/python3.9/site-packages/transformers/models/t5/modeling_t5.py", line 1683, in forward
    encoder_outputs = self.encoder(
  File "~/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "~/anaconda3/lib/python3.9/site-packages/transformers/models/t5/modeling_t5.py", line 1090, in forward
    layer_outputs = layer_module(
  File "~/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "~/anaconda3/lib/python3.9/site-packages/transformers/models/t5/modeling_t5.py", line 753, in forward
    hidden_states = self.layer[-1](hidden_states)
  File "~/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "~/anaconda3/lib/python3.9/site-packages/transformers/models/t5/modeling_t5.py", line 342, in forward
    forwarded_states = self.DenseReluDense(forwarded_states)
  File "~/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "~/anaconda3/lib/python3.9/site-packages/transformers/models/t5/modeling_t5.py", line 319, in forward
    isinstance(self.wo.weight, torch.Tensor)
  File "~/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'QuantLinear' object has no attribute 'weight'
Edit: added snippet showing code modifications, and edited quantization command to show PYTORCH_CUDA_ALLOC_CONF environment variable.
Not sure what I did differently, but it started suggesting qweight now...
AttributeError: 'QuantLinear' object has no attribute 'weight'. Did you mean: 'qweight'?
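For anyone else reading the traceback: the failing line in modeling_t5.py reads self.wo.weight directly, and the error message indicates that the quantized layer exposes qweight but no weight. The sketch below shows that failure mode and a defensive lookup. It uses plain Python stand-ins, not the real torch or GPTQ classes. PlainLinear, PackedLinear and describe_layer are hypothetical names made up for illustration, and this is not necessarily how the patched transformers fork handles it.

```python
class PlainLinear:
    """Hypothetical stand-in for a regular linear layer, which has `weight`."""

    def __init__(self):
        self.weight = [[0.0]]


class PackedLinear:
    """Hypothetical stand-in for a quantized layer, which has only `qweight`."""

    def __init__(self):
        self.qweight = [[0]]


def describe_layer(layer):
    """Classify a layer without assuming that `weight` exists."""
    # getattr with a default returns None instead of raising AttributeError,
    # which is what an unguarded `layer.weight` access does in the traceback.
    if getattr(layer, "weight", None) is not None:
        return "dense"
    if hasattr(layer, "qweight"):
        return "quantized"
    return "unknown"


if __name__ == "__main__":
    for layer in (PlainLinear(), PackedLinear()):
        print(type(layer).__name__, "->", describe_layer(layer))
```

Any code path that inspects the weight dtype would need a guard of this kind, or a version of the model code that already knows about quantized layers.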
My apologies. It seems a requirement was somehow not installed, or overwritten with your transformers-t5 repo.
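In case it helps someone hitting the same error, here is a minimal sketch for checking which interpreter and which package versions a run actually uses. It relies only on the standard library. describe_environment is a hypothetical helper name, and the default package list is an assumption based on the versions listed at the top of this issue. Note that the "Did you mean" hint on AttributeError is only produced by Python 3.10 and later, so the run that printed it may have used a different interpreter than the 3.9.10 listed above.

```python
import sys
from importlib import metadata


def describe_environment(packages=("transformers", "torch")):
    """Return the interpreter details plus the installed version of each package."""
    info = {
        "python": sys.version.split()[0],
        "executable": sys.executable,
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            info[name] = None
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this with the same python command used for t5.py shows whether the expected transformers build is the one being imported.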