[Bug Report] RuntimeError when running instruction fine-tuning on Mistral 7B, SageMaker JumpStart
louishourcade opened this issue · 2 comments
Link to the notebook
https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/jumpstart-foundation-models/mistral-7b-instruction-domain-adaptation-finetuning.ipynb
Describe the bug
I get an error when I run the training step for instruction fine-tuning in this notebook. The training job starts properly, but after ~10 min it fails with the following error message: RuntimeError: Could not find response key [1, 32002] in token IDs tensor([ 1, 20811, 349, ..., 302, 15637, 266])
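For context, the message reads as if the training code searches each tokenized example for a marker sequence (here [1, 32002]) and fails when that sequence is absent. The JumpStart training script is not shown in this issue, so the following is only a minimal Python sketch of that kind of check under that assumption; `find_response_key` and both sample sequences are hypothetical and are not taken from the notebook or the training container.

```python
from typing import Sequence


def find_response_key(token_ids: Sequence[int], response_key: Sequence[int]) -> int:
    """Return the index at which response_key starts inside token_ids.

    Hypothetical helper for illustration; not the JumpStart training code.
    """
    n, k = len(token_ids), len(response_key)
    # Slide a window of length k over the sequence and compare it to the key.
    for start in range(n - k + 1):
        if list(token_ids[start:start + k]) == list(response_key):
            return start
    raise RuntimeError(
        f"Could not find response key {list(response_key)} "
        f"in token IDs {list(token_ids)}"
    )


# Response key copied from the error message in this issue.
RESPONSE_KEY = [1, 32002]

# Made-up token sequences, for illustration only.
with_key = [1, 20811, 349, 1, 32002, 302, 15637]
without_key = [1, 20811, 349, 302, 15637, 266]

print(find_response_key(with_key, RESPONSE_KEY))  # 3

try:
    find_response_key(without_key, RESPONSE_KEY)
except RuntimeError as err:
    # Same kind of message as in the failing training job.
    print(err)
```

If the real check works along these lines, it may be worth verifying whether the failing example still contains the marker tokens after tokenization.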
To reproduce
- Upload the notebook to a SageMaker notebook
- Run every cell. The error appears when running the instruction fine-tuning training job (section 1.3, Starting Training)
Logs
Attaching some screenshots of the logs
Any idea how to fix this?
@louishourcade: I'm facing the same issue while running the example notebook from AWS. Did you find a solution?
Hi @prakash5801, no, I haven't found time to investigate further. But I checked yesterday and the error is still there.