georgian-io/Multimodal-Toolkit

AttributeError: 'OurTrainingArguments' object has no attribute 'deepspeed_plugin'

levente-murgas opened this issue · 5 comments

Describe the bug
When I run main.py with my own config JSON file, as advised in the README, execution aborts with the error
AttributeError: 'OurTrainingArguments' object has no attribute 'deepspeed_plugin'.

To Reproduce
Steps to reproduce the behavior:

  1. Create new virtualenv (thus we get a clean slate).
  2. Clone repository containing main.py from GitHub.
  3. Install the module with pip: pip install multimodal-transformers
  4. Pass an appropriate JSON file to the script and run main.py:
    python /content/Multimodal-Toolkit/main.py /content/data/config.json
  5. See error

Expected behavior
The selected model should be trained with the configuration given in the JSON.

Desktop:

  • OS: Windows 11
  • Python: 3.10.12
  • Environment: Colab

Hi @levente-murgas, were you able to run the code using the sample configs that are present?

Hi @akashsaravanan-georgian

I tried running the sample configs with

    $ python ./Multimodal-Toolkit/main.py ./Multimodal-Toolkit/datasets/Melbourne_Airbnb_Open_Data/train_config.json

but it fails with the same sequence of errors as it did with my own config. First it complains that the accelerate library is not installed:

    File \venv\lib\site-packages\transformers\trainer.py", line 3801, in create_accelerator_and_postprocess
        if version.parse(accelerate_version) > version.parse("0.20.3"):
    NameError: name 'accelerate_version' is not defined

Then I install accelerate with pip install accelerate.

Then I run it again and get the deepspeed_plugin error.
Using the sample configs does not change the fact that the OurTrainingArguments class has no deepspeed_plugin attribute.
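For anyone hitting the first error in this sequence, a quick pre-flight check can confirm whether accelerate is importable before launching main.py. This is a minimal sketch using only the standard library; the helper name `is_installed` is made up for illustration and is not part of the toolkit or of transformers.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in the current environment.

    Hypothetical helper for illustration; it only checks that the module
    is discoverable, not that it imports cleanly or has a compatible version.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # The two packages named in this thread.
    for name in ("accelerate", "transformers"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

If accelerate shows as missing, installing it with pip install accelerate (as done above) removes the NameError, though it does not address the deepspeed_plugin error.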

Hi @levente-murgas, I think I've figured out the issue. The latest version of transformers seems to have broken something in the library. Could you downgrade transformers with pip install transformers==4.26.1 and see if the same error occurs?
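To see whether an environment is affected, one option is to compare the installed transformers version against the 4.26.1 pin suggested here. The following is a rough sketch, not an official compatibility check: the helpers `version_tuple`, `is_newer` and `installed_version` are hypothetical names, and the comparison only looks at the leading numeric components of the version string.

```python
from importlib import metadata

PINNED = "4.26.1"  # version suggested in this thread


def version_tuple(version):
    """Turn '4.26.1' into (4, 26, 1), stopping at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break  # e.g. a 'dev0' suffix
        parts.append(int(digits))
    return tuple(parts)


def is_newer(installed, pinned=PINNED):
    """Return True if the installed version is strictly newer than the pin."""
    return version_tuple(installed) > version_tuple(pinned)


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("transformers")
    if found is None:
        print("transformers is not installed")
    elif is_newer(found):
        print(f"transformers {found} is newer than {PINNED}; consider downgrading")
    else:
        print(f"transformers {found} is at or below {PINNED}")
```

The thread only confirms that 4.26.1 works; it does not establish which later release introduced the breakage, so a "newer" result is a hint rather than a diagnosis.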

Hi @akashsaravanan-georgian,

That's what I was thinking, too. Downgrading to 4.26.1 indeed solved the issue.

I have a different question regarding the implementation of the MultimodalDataTrainingArguments class. There, the num_classes attribute specifies the number of classes for classification. If I use the default value (-1), will the program infer the number of classes by itself (e.g., from the label_col in column_info or from somewhere else), or do I have to provide the number of classes explicitly in num_classes for the program to work as expected?

I believe it infers it. If you pass in -1, it uses the number of unique labels in your training data.
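The behaviour described for -1 can be illustrated with a small sketch. This is not the toolkit's actual code: `resolve_num_classes` is a hypothetical helper that only mirrors the rule stated in this reply, namely that -1 falls back to the number of unique labels in the training data.

```python
def resolve_num_classes(num_classes, train_labels):
    """Return the class count to use for classification.

    Hypothetical illustration of the rule described in this thread:
    a value of -1 means "infer from the data", so the number of
    distinct training labels is used; any other value is taken as given.
    """
    if num_classes == -1:
        return len(set(train_labels))
    return num_classes


if __name__ == "__main__":
    labels = ["cat", "dog", "cat", "bird"]
    print(resolve_num_classes(-1, labels))  # inferred from the labels
    print(resolve_num_classes(5, labels))   # explicit value is kept
```

One consequence of inferring from the training split is that a class appearing only in validation or test data would not be counted, so passing num_classes explicitly may be safer in that situation.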