modal-labs/llm-finetuning

Why add special tokens manually in sql_dataset.py

Closed this issue · 1 comments

In datasets/sql_dataset.py you manually add a bunch of special tokens in the prompt instead of relying on the tokenizer to handle this. Is there a good reason for that?

It was copied from the original recipes repo (which we have now deprecated) so unfortunately I'm not sure why they did so originally.