MAGICS-LAB/DNABERT_2

Fine-tune for continuous labels

buwanim opened this issue · 4 comments

Hi,
I'm trying to fine-tune this model for a regression problem with continuous labels. To do that, I set 'num_labels' to 1 when loading the model, as follows.

model = transformers.AutoModelForSequenceClassification.from_pretrained(
    model_args.model_name_or_path,
    cache_dir=training_args.cache_dir,
    num_labels=1,
    trust_remote_code=True,
)

But now I get this error. I believe this is because of the changes I made for regression. What modifications would you suggest to overcome these errors when fine-tuning for a regression problem?

Traceback (most recent call last):
  File "/DNABERT2/DNABERT_2/finetune/train.py", line 319, in <module>
    train()
  File "/DNABERT2/DNABERT_2/finetune/train.py", line 301, in train
    trainer.train()
  File "/home/.local/lib/python3.9/site-packages/transformers/trainer.py", line 1664, in train
    return inner_training_loop(
  File "/home/.local/lib/python3.9/site-packages/transformers/trainer.py", line 1940, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/.local/lib/python3.9/site-packages/transformers/trainer.py", line 2745, in training_step
    self.scaler.scale(loss).backward()
  File "/home/.local/lib/python3.9/site-packages/torch/_tensor.py", line 522, in backward
    torch.autograd.backward(
  File "/home/.local/lib/python3.9/site-packages/torch/autograd/__init__.py", line 266, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: Found dtype Long but expected Float

Solved the issue. I had to change the label type from 'long' to 'float' in the following line of the DataCollatorForSupervisedDataset class in train.py:

Original:
labels = torch.Tensor(labels).long()

Modified for regression:
labels = torch.Tensor(labels).float()
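
For anyone following along, here is a minimal sketch of why the dtype matters. It assumes the classification head follows the usual Hugging Face convention of switching to a mean-squared-error loss when num_labels is 1; the label values and the all-zero logits are hypothetical stand-ins, not data from this issue.

```python
import torch
import torch.nn.functional as F

# Hypothetical continuous targets for a batch of three sequences.
raw_labels = [0.5, 1.25, -2.0]

# Casting to long drops the fractional part, which is wrong for regression.
truncated = torch.tensor(raw_labels).long()

# Keeping the labels as float32 preserves the continuous values.
labels = torch.tensor(raw_labels, dtype=torch.float32)

# Stand-in for the model output with num_labels=1: shape (batch, 1).
logits = torch.zeros(3, 1, requires_grad=True)

# MSE loss between the squeezed predictions and the float labels.
loss = F.mse_loss(logits.squeeze(-1), labels)
loss.backward()  # succeeds because both tensors are floating point
```

Building the tensor with an explicit dtype=torch.float32 is equivalent to the .float() call above for this purpose; either way the labels reach the loss as floats.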

Even with the above changes, the predictions I get are all zeros. Is there anything else I should change for the model to work with continuous labels (for regression)?

I think the model is naturally applicable to regression with your modifications. Can you share more information about your fine-tuning? Does the loss look normal? If the prediction is always 0, it may mean the model has converged to some weird local minimum.

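You also need to change the code of the preprocess_logits_for_metrics() function in 'train.py' so that it returns a continuous value for regression, rather than the index of the maximum value as used for classification.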
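
A rough sketch of what that change could look like, and why an argmax-based version would report all zeros. This assumes the existing function takes an argmax over the last dimension of the logits, as the comment above implies; the name preprocess_logits_for_regression and the sample logits are hypothetical, not code from the repository.

```python
import torch

def preprocess_logits_for_regression(logits, _labels=None):
    """Return continuous predictions instead of class indices."""
    # Some models return a tuple; the logits are assumed to come first.
    if isinstance(logits, tuple):
        logits = logits[0]
    # With num_labels=1 the logits have shape (batch, 1); drop the last axis.
    return logits.squeeze(-1)

# Hypothetical model output for three sequences with num_labels=1.
logits = torch.tensor([[0.7], [-1.3], [2.1]])

# Argmax over a dimension of size 1 can only ever return index 0.
as_classes = torch.argmax(logits, dim=-1)

# The regression variant keeps the actual predicted values.
as_values = preprocess_logits_for_regression(logits)
```

A function like this would be passed to the Trainer through its preprocess_logits_for_metrics argument. The metrics computed afterwards would presumably also need to be regression metrics (for example mean squared error) rather than accuracy or F1.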