Finetune lora max_seq_length error
SergioG-M opened this issue · 4 comments
I am getting an error when running litgpt finetune_lora
At the beginning of training the max_seq_length is set to 466 because that is the longest sequence in my training set
"The longest sequence length in the train data is 466, the model's maximum sequence length is 466 and context length is 2048"
However, when the training is finished and a final validation is performed in
litgpt/litgpt/finetune/lora.py
Line 214 in 0f3bca7
"Cannot forward sequence of length 473, max seq length is only 466"
There is at least one sample in the validation set that is longer than the longest one in the training set. Does anyone know how to fix this?
This is the traceback I get
File "/usr/local/lib/python3.10/dist-packages/litgpt/finetune/lora.py", line 215, in main
val_loss = validate(fabric, model, val_dataloader, dataclasses.replace(eval, max_iters=len(val_dataloader)))
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/litgpt/finetune/lora.py", line 353, in validate
logits = model(input_ids)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/lightning/fabric/wrappers.py", line 139, in forward
output = self._forward_module(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/litgpt/lora.py", line 527, in forward
raise ValueError(f"Cannot forward sequence of length {T}, max seq length is only {self.max_seq_length}.")
ValueError: Cannot forward sequence of length 473, max seq length is only 466.
Thanks for sharing. Yeah, this shouldn't happen; the max sequence length calculation should happen on both the training and validation data, not just the training data. Will have to look into this and update.
In the meantime, you could rerun the training with --train.max_seq_length 512
or so to make sure this doesn't happen in your case.
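To pick a safe value for `--train.max_seq_length` yourself, you can scan both splits rather than just the training split. A minimal sketch, assuming your samples are already tokenized into lists of token ids (the helper names `longest_seq_length` and `safe_max_seq_length` are hypothetical, not part of litgpt):

```python
def longest_seq_length(samples):
    """Return the length of the longest token sequence in `samples`."""
    return max(len(sample) for sample in samples)

def safe_max_seq_length(train_samples, val_samples, context_length=2048):
    """Longest sequence across train AND val, capped at the model's context length."""
    longest = max(longest_seq_length(train_samples),
                  longest_seq_length(val_samples))
    return min(longest, context_length)

# Dummy token-id lists mirroring the numbers in this issue:
train = [[0] * 466, [0] * 300]  # longest train sample: 466 tokens
val = [[0] * 473, [0] * 100]    # longest val sample: 473 tokens
print(safe_max_seq_length(train, val))  # 473, not 466
```

Passing a value at least this large (e.g. rounding up to 512) avoids the `ValueError` during the final validation pass.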
Thanks!
Actually, I think that --train.max_seq_length is not enough; the problem comes from
litgpt/litgpt/finetune/lora.py
Line 247 in 0f3bca7
So I just changed that in my case
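The spirit of that local change is to size the model's maximum sequence length from both splits before validation runs. An illustrative sketch only; `DummyModel`, `train_data`, and `val_data` are placeholders, not litgpt's actual code:

```python
class DummyModel:
    """Stand-in for the model object whose max_seq_length gets set."""
    max_seq_length = 0

train_data = [[0] * 466]  # longest train sample: 466 tokens
val_data = [[0] * 473]    # longest val sample: 473 tokens

model = DummyModel()
# Take the longest sequence across BOTH splits, not just training:
longest = max(max(len(s) for s in train_data),
              max(len(s) for s in val_data))
model.max_seq_length = longest  # now covers the 473-token validation sample
print(model.max_seq_length)  # 473
```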
Should be fixed now.