hyintell/BLOOM-fine-tuning

Could you please share the format of your dataset?

judyhappy opened this issue · 2 comments

I tried your script with a dataset like this:
[{
"inputs": "here my question",
"targets": "the target output of model",
"text": ""
}]

But the fine-tuned model can't generate any output.
Is it because my "text" column is empty?

I added this empty 'text' column because of this error:

tokenized = split_dataset.map(
ValueError: Column to remove ['text'] not in the dataset. Current columns in the dataset: ['inputs', 'targets']

It looks like the 'text' column is mandatory.

For the data format, please follow: https://huggingface.co/datasets/tatsu-lab/alpaca
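
For anyone starting from `inputs`/`targets` records like the ones in the question, here is a minimal sketch of converting them into the four columns used by the tatsu-lab/alpaca dataset (`instruction`, `input`, `output`, `text`). The mapping of `inputs` to `instruction` and `targets` to `output` is an assumption, the helper name `to_alpaca` is hypothetical, and the prompt wording follows the Alpaca no-input template. Whether the fine-tuning script in this repo actually trains on the `text` column has not been confirmed here, so check the script before relying on it.

```python
import json

# Alpaca-style prompt for examples that have no extra "input" context.
# In the Alpaca dataset, the "text" column holds the prompt plus the answer.
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n{output}"
)


def to_alpaca(record):
    """Convert one {"inputs", "targets"} record to Alpaca-style columns.

    Assumption: "inputs" plays the role of the instruction and
    "targets" plays the role of the expected output.
    """
    instruction = record["inputs"]
    output = record["targets"]
    return {
        "instruction": instruction,
        "input": "",  # no additional context in this example
        "output": output,
        "text": PROMPT_NO_INPUT.format(instruction=instruction, output=output),
    }


records = [
    {"inputs": "here my question", "targets": "the target output of model"}
]
alpaca_records = [to_alpaca(r) for r in records]

# Inspect the converted records before saving them as the training file.
print(json.dumps(alpaca_records, indent=2))
```

With this conversion the `text` column is no longer empty, so a script that drops or reads `text` will find it present and populated.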