Custom data file for classification seems to be failing

Question

zippeurfou opened this issue 3 years ago · 7 comments

🐛 Bug

When training a classification model on a custom data file, training fails because the model expects num_classes to be set.

To Reproduce

Use this Colab notebook:
https://colab.research.google.com/drive/1uamw6SNaOr_4ch24JNxAj2yfgLUKfJqO?usp=sharing
Error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/lightning_transformers/cli/train.py", line 84, in hydra_entry
    main(cfg)
  File "/usr/local/lib/python3.7/dist-packages/lightning_transformers/cli/train.py", line 78, in main
    logger=logger,
  File "/usr/local/lib/python3.7/dist-packages/lightning_transformers/cli/train.py", line 61, in run
    trainer.fit(model, datamodule=data_module)
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 458, in fit
    self.call_setup_hook(model)
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1066, in call_setup_hook
    model.setup(stage_name)
  File "/usr/local/lib/python3.7/dist-packages/lightning_transformers/core/model.py", line 88, in setup
    self.configure_metrics(stage)
  File "/usr/local/lib/python3.7/dist-packages/lightning_transformers/task/nlp/text_classification/model.py", line 61, in configure_metrics
    self.prec = Precision(num_classes=self.num_classes)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 948, in __getattr__
    type(self).__name__, name))
AttributeError: 'TextClassificationTransformer' object has no attribute 'num_classes'

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

Expected behavior

It should start training.

Environment

See the notebook linked above.

Answer 1 · 2021-04-26T00:57:50.000Z

same here~

$ python train.py task=nlp/text_classification dataset=nlp/text_classification/emotion
...
torch.nn.modules.module.ModuleAttributeError: 'TextClassificationTransformer' object has no attribute 'num_classes'

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

Answer 2 · 2021-04-28T11:29:59.000Z

Thanks, guys! This makes sense. I think the right approach to fix this would be either to infer the number of classes from the data (by collecting all unique labels, which I think HF Datasets supports) or to allow the user to pass it in.
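
To make the two options concrete, here is a rough sketch, not the actual fix in the library. It assumes the labels are available as a plain iterable; infer_num_classes is a hypothetical helper name, and the sample labels are made up for illustration.

```python
from typing import Hashable, Iterable, Optional


def infer_num_classes(
    labels: Iterable[Hashable],
    num_classes: Optional[int] = None,
) -> int:
    """Return the class count for a classification dataset.

    Hypothetical helper: if the user passes num_classes explicitly, trust it;
    otherwise count the unique labels seen in the data.
    """
    if num_classes is not None:
        return num_classes
    # With HF Datasets, dataset.unique("label") should yield the same set of
    # values; a plain iterable is used here to keep the sketch self-contained.
    return len(set(labels))


# Made-up labels, for illustration only.
train_labels = ["joy", "anger", "joy", "sadness"]
inferred = infer_num_classes(train_labels)
overridden = infer_num_classes(train_labels, num_classes=6)
```

Inferring from the training split alone can undercount if some class only appears in validation or test data, which is one reason to keep the explicit override.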

Answer 3 · 2021-05-27T15:27:29.000Z

This error only occurs with the CLI.
When will this be fixed, or how can I fix it myself?

Answer 4 · 2021-05-27T16:04:17.000Z

However, when I run predict, I get another error.

Answer 5 · 2021-05-28T06:47:15.000Z

I am running into the same issue. Are there any workarounds?

'TextClassificationTransformer' object has no attribute 'num_classes'

Answer 6 · 2021-07-27T08:05:07.000Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Answer 7 · 2021-12-22T22:00:05.000Z

See #216 and #215.