Unexpected keyword argument 'bf16'
agokrani opened this issue · 14 comments
Hi,
I am trying to reproduce the setup on a T4 in Google Colab and am getting the following error:
Traceback (most recent call last):
File "/content/drive/MyDrive/git/weak-to-strong-fixed/train_weak_to_strong.py", line 356, in
fire.Fire(main)
File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/content/drive/MyDrive/git/weak-to-strong-fixed/train_weak_to_strong.py", line 272, in main
weak_test_results, weak_ds = train_model(
File "/content/drive/MyDrive/git/weak-to-strong-fixed/train_weak_to_strong.py", line 250, in train_model
return train_and_save_model(
File "/content/drive/MyDrive/git/weak-to-strong-fixed/weak_to_strong/train.py", line 229, in train_and_save_model
model = TransformerWithHead.from_pretrained(
File "/content/drive/MyDrive/git/weak-to-strong-fixed/weak_to_strong/model.py", line 34, in from_pretrained
return cls(name, **kwargs)
File "/content/drive/MyDrive/git/weak-to-strong-fixed/weak_to_strong/model.py", line 22, in init
lm = AutoModelForCausalLM.from_pretrained(name, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 566, in from_pretrained
return model_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3450, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
TypeError: GPT2LMHeadModel.__init__() got an unexpected keyword argument 'bf16'
Do you know why this might be the case?
I am getting the same error. I am not certain if this is because I am using an Apple M3 (with Python 3.11). I did have to work around torch.cuda... but then ran into the 'bf16' issue.
bf16 is a data type that is only supported on high-end GPUs like the A100.
Similarly, fp32 may not be supported.
Simple fix: empty the kwargs parameter inside the model definition by commenting out the bf16 and fp32 entries, and it should work. bf16 would be required for the Qwen models, but the given code does not run the Qwen model, so the code will work.
A T4 GPU won't be able to run the bf16 data type.
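A minimal sketch of that workaround, assuming the flags live in a custom_kwargs dict as discussed in this thread (the keys and values shown are illustrative):

```python
# Illustrative only: strip the flags that from_pretrained cannot accept,
# which is equivalent to commenting the entries out in the model config.
custom_kwargs = {"bf16": True, "fp32": False}
for bad_key in ("bf16", "fp32"):
    custom_kwargs.pop(bad_key, None)
# custom_kwargs is now empty and safe to pass to from_pretrained.
```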
Nice !!
A PR is welcome; it would be good to have custom_kwargs avoid passing invalid flags :)
Just created a PR adding a bf16 flag which defaults to False.
Would that not be a suboptimal fix, removing parameters from kwargs?
Instead, I created a custom function that checks whether the parameters within kwargs are valid and handles the TypeError by removing the offending parameter from model initialization recursively (a rough sketch of the idea follows below). I think this would be better practice, and hence created a PR.
However, I do understand the above solution is easier; up to you @WuTheFWasThat.
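A rough sketch of that idea, written without reference to the actual PR: catch the TypeError raised for an unexpected keyword argument, drop the offending key, and retry until the model loads. The helper name and retry loop are hypothetical.

```python
import re

from transformers import AutoModelForCausalLM


def load_dropping_bad_kwargs(name, **kwargs):
    # Hypothetical helper: retry the load, removing each kwarg that
    # triggers an "unexpected keyword argument" TypeError.
    while True:
        try:
            return AutoModelForCausalLM.from_pretrained(name, **kwargs)
        except TypeError as err:
            match = re.search(r"unexpected keyword argument '(\w+)'", str(err))
            if match is None or match.group(1) not in kwargs:
                raise  # not an error we know how to recover from
            kwargs.pop(match.group(1))


# e.g. load_dropping_bad_kwargs("gpt2", bf16=True) silently drops bf16.
```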
It seems like the bf16 and fp32 arguments are for TrainingArguments, not from_pretrained. Replacing the original custom_kwargs items with "torch_dtype": torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float32 works in my case.
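For reference, a minimal sketch of that replacement; the model name and the extra is_available guard are my own additions:

```python
import torch
from transformers import AutoModelForCausalLM

# torch_dtype is a valid from_pretrained argument, unlike bf16/fp32.
dtype = (
    torch.bfloat16
    if torch.cuda.is_available() and torch.cuda.is_bf16_supported()
    else torch.float32
)
lm = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype=dtype)
```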
@fffffarmer, can you create a pull request including your changes? It seems more promising than the other solutions here.
thanks @fffffarmer, merged!