openai/weak-to-strong

Unexpected keyword argument 'bf16'

agokrani opened this issue · 14 comments

Hi,

I am trying to reproduce the setup on a T4 Google Colab and am getting the following error:

Traceback (most recent call last):
File "/content/drive/MyDrive/git/weak-to-strong-fixed/train_weak_to_strong.py", line 356, in
fire.Fire(main)
File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/content/drive/MyDrive/git/weak-to-strong-fixed/train_weak_to_strong.py", line 272, in main
weak_test_results, weak_ds = train_model(
File "/content/drive/MyDrive/git/weak-to-strong-fixed/train_weak_to_strong.py", line 250, in train_model
return train_and_save_model(
File "/content/drive/MyDrive/git/weak-to-strong-fixed/weak_to_strong/train.py", line 229, in train_and_save_model
model = TransformerWithHead.from_pretrained(
File "/content/drive/MyDrive/git/weak-to-strong-fixed/weak_to_strong/model.py", line 34, in from_pretrained
return cls(name, **kwargs)
File "/content/drive/MyDrive/git/weak-to-strong-fixed/weak_to_strong/model.py", line 22, in init
lm = AutoModelForCausalLM.from_pretrained(name, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 566, in from_pretrained
return model_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3450, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
TypeError: GPT2LMHeadModel.__init__() got an unexpected keyword argument 'bf16'

Any idea why this might be the case?

I am getting the same error. I am not certain whether this is because I am using an Apple M3 (with Python 3.11). I did have to work around torch.cuda..., but then ran into the 'bf16' issue.

bf16 is a data type that is only supported on high-end GPUs like the A100;
similarly, fp32 may not be supported.
Simple fix: empty out the custom_kwargs in the model definition by commenting out the bf16 and fp32 entries and it should work. bf16 would be required for the Qwen models, but the given code does not run the Qwen models, so it will work without them.
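For example, here is a minimal sketch of what the emptied-out kwargs could look like (the field name and the is_bf16_supported calls are assumptions about the config in train_weak_to_strong.py, not the repo's exact code):

```python
import torch

# Hypothetical sketch: leave custom_kwargs empty so no dtype flags reach
# GPT2LMHeadModel.__init__ via AutoModelForCausalLM.from_pretrained.
custom_kwargs = {
    # "bf16": torch.cuda.is_bf16_supported(),      # drop: GPT-2 rejects this kwarg
    # "fp32": not torch.cuda.is_bf16_supported(),  # drop: same reason
}
```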

The T4 GPU won't be able to run the bf16 data type.

Hi @srivhash,

The solution worked for me.

Thanks!

Nice !!

A PR is welcome; it would be good to have custom_kwargs avoid passing invalid flags :)

Just created a PR adding a bf16 flag which defaults to False.

zky001 commented

[screenshot]
Seems like it's not supported.

Wouldn't that be a suboptimal fix, removing parameters from kwargs?

Instead, I created a custom function that checks whether the parameters in kwargs are valid and handles the TypeError by recursively removing the offending parameter from the model initialization. I think this is better practice, and hence created a PR.
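For reference, a minimal, hypothetical sketch of the recursive-filtering idea (the function name is made up; the actual PR may differ):

```python
import re

from transformers import AutoModelForCausalLM


def from_pretrained_dropping_invalid_kwargs(name, **kwargs):
    # Call from_pretrained; if the underlying model rejects a kwarg,
    # drop that kwarg and retry until the call succeeds.
    try:
        return AutoModelForCausalLM.from_pretrained(name, **kwargs)
    except TypeError as e:
        # The error reads like: "... got an unexpected keyword argument 'bf16'"
        match = re.search(r"unexpected keyword argument '(\w+)'", str(e))
        if match is None or match.group(1) not in kwargs:
            raise
        kwargs.pop(match.group(1))
        return from_pretrained_dropping_invalid_kwargs(name, **kwargs)
```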

However, I do understand the above solution is easier; up to you @WuTheFWasThat.

It seems like the bf16 and fp32 arguments are for TrainingArguments, not from_pretrained. Replacing the original custom_kwargs items with "torch_dtype": torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float32 works in my case.
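In code, that replacement would look roughly like this (assuming the rest of the ModelConfig stays the same):

```python
import torch

# torch_dtype is a real from_pretrained argument, unlike bf16/fp32.
custom_kwargs = {
    "torch_dtype": torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float32,
}
```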

@fffffarmer can you create a pull request including your changes? It seems more promising than the other solutions here.

thanks @fffffarmer, merged!