Fine-tune for continuous labels
buwanim opened this issue · 4 comments
Hi,
I'm trying to fine-tune this model for a regression problem with continuous labels. To do that, I changed 'num_labels' to 1 when loading the model, as follows.
model = transformers.AutoModelForSequenceClassification.from_pretrained(
    model_args.model_name_or_path,
    cache_dir=training_args.cache_dir,
    num_labels=1,
    trust_remote_code=True,
)
But now I get the error below. I believe it is caused by the change I made for regression. What modifications would you suggest to fix this when fine-tuning for a regression problem?
Traceback (most recent call last):
File "/DNABERT2/DNABERT_2/finetune/train.py", line 319, in <module>
train()
File "/DNABERT2/DNABERT_2/finetune/train.py", line 301, in train
trainer.train()
File "/home/.local/lib/python3.9/site-packages/transformers/trainer.py", line 1664, in train
return inner_training_loop(
File "/home/.local/lib/python3.9/site-packages/transformers/trainer.py", line 1940, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/home/.local/lib/python3.9/site-packages/transformers/trainer.py", line 2745, in training_step
self.scaler.scale(loss).backward()
File "/home/.local/lib/python3.9/site-packages/torch/_tensor.py", line 522, in backward
torch.autograd.backward(
File "/home/.local/lib/python3.9/site-packages/torch/autograd/__init__.py", line 266, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: Found dtype Long but expected Float
Solved the issue. I had to change the label dtype from 'long' to 'float' in the following line of class DataCollatorForSupervisedDataset(object) in train.py:
Original:
    labels = torch.Tensor(labels).long()
Modified for regression:
    labels = torch.Tensor(labels).float()
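For context: with num_labels=1 the sequence-classification head falls back to MSELoss, whose backward pass rejects integer targets; the forward pass promotes the dtypes, which is why the error only surfaces in loss.backward(). A minimal sketch reproducing this outside the training loop (behavior as of recent PyTorch versions):

import torch

preds = torch.zeros(4, requires_grad=True)                 # float outputs, like a regression head
long_labels = torch.tensor([3, 1, 2, 0])                   # .long() labels, as in the original collator
loss = torch.nn.functional.mse_loss(preds, long_labels)    # forward succeeds via dtype promotion
# loss.backward()  # RuntimeError: Found dtype Long but expected Float

loss = torch.nn.functional.mse_loss(preds, long_labels.float())
loss.backward()                                            # works once the targets are float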
Even with the above changes, the predictions I get are all zeros. Is there anything else I should change for the model to work with continuous labels (for regression)?
I think the model is naturally applicable to regression with your modifications. Can you share more information about your fine-tuning? Does the loss look normal? If the prediction is always 0, it may mean the model has converged to some weird local minimum.
You also need to modify the preprocess_logits_for_metrics() function in train.py at the same time, so that it returns a continuous value for regression instead of the index of the maximum logit used for classification.
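For example, here is a minimal sketch of what that change might look like (the exact function bodies in finetune/train.py may differ; the regression metrics below are illustrative and assume numpy and scipy are available):

import numpy as np
from scipy import stats

def preprocess_logits_for_metrics(logits, labels):
    # For regression, keep the continuous prediction instead of torch.argmax(logits, dim=-1).
    if isinstance(logits, tuple):  # some models return (logits, hidden_states, ...)
        logits = logits[0]
    return logits.squeeze(-1)      # shape (batch,), one continuous value per example

def compute_metrics(eval_pred):
    # compute_metrics should likewise report regression metrics instead of accuracy/F1.
    predictions, labels = eval_pred
    predictions = np.asarray(predictions).reshape(-1)
    labels = np.asarray(labels).reshape(-1)
    return {
        "mse": float(np.mean((predictions - labels) ** 2)),
        "pearson": float(stats.pearsonr(predictions, labels)[0]),
    }

With these three changes together (num_labels=1, float labels in the collator, and continuous values returned for metrics), the Trainer should both train and evaluate on continuous targets.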