ServiceNow/picard

How to fine-tune/evaluate/predict on a custom dataset?

Dakingrai opened this issue · 2 comments

Hi Torsten,
Thank you for such excellent work!
I have a few questions about using PICARD: I want to fine-tune your model on my custom dataset and then use it to evaluate/predict on the dev/test set.
Based on previous issues in this repo, this is what I am trying:
Step 1: I started training T5-base using the following train.json file:
{
"run_name": "t5-train",
"model_name_or_path": "t5-base",
"dataset": "spider",
"source_prefix": "",
"schema_serialization_type": "peteshaw",
"schema_serialization_randomized": false,
"schema_serialization_with_db_id": true,
"schema_serialization_with_db_content": true,
"normalize_query": true,
"target_with_db_id": true,
"output_dir": "/train",
"cache_dir": "/transformers_cache",
"do_train": true,
"do_eval": true,
"fp16": false,
"num_train_epochs": 3072,
"per_device_train_batch_size": 20,
"per_device_eval_batch_size": 1,
"gradient_accumulation_steps": 410,
"label_smoothing_factor": 0.0,
"learning_rate": 1e-4,
"adafactor": true,
"adam_eps": 1e-6,
"lr_scheduler_type": "constant",
"warmup_ratio": 0.0,
"warmup_steps": 0,
"seed": 1,
"report_to": ["wandb"],
"logging_strategy": "steps",
"logging_first_step": true,
"logging_steps": 4,
"load_best_model_at_end": true,
"metric_for_best_model": "exact_match",
"greater_is_better": true,
"save_total_limit": 128,
"save_steps": 64,
"evaluation_strategy": "steps",
"eval_steps": 64,
"predict_with_generate": true,
"num_beams": 1,
"num_beam_groups": 1,
"use_picard": false
}
For training, I am using an A100 with 40 GB, and the problem is the execution time: for the base model it shows ~70 hours (150 s/it).
Question 1: Is this the expected behavior?
Question 2: Should I set the "use_picard" flag to true? Will this make any difference in training? (I guess eval will probably change?)
Step 2: Once this model is trained, I will use it to fine-tune on my custom dataset.
Question 3: Can you please guide me through this step? How can I fine-tune from an existing trained model? Also, what about the "use_picard" flag: should I use it for fine-tuning?
Step 3: Once steps 1 and 2 are done, I will try to evaluate the fine-tuned model (from step 2) on my custom dataset.
Question 4: I believe I should use the eval Docker image (or serve?) for this?
Question 5: Can I skip the whole training process altogether? You have already provided certain checkpoints (like 3vnuv1vf); can I use them to fine-tune on my dataset? If yes, can you please shed some light on it?

Thank you again for the excellent work, and I look forward to hearing from you 🙂

Hi @Dakingrai!

Q1: Yes, training takes time...
Q2: If you set the flag to true, the best model will be chosen based on its eval accuracy with PICARD. If it's turned off, the best model will be chosen without PICARD. In my experience, it's safe to turn it off to find a good model.
Q3: You just start training from a model checkpoint: replace "t5-base" with its path.
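As a minimal sketch (the checkpoint path, run name, dataset name, and output directory below are placeholders; how a custom dataset gets registered depends on your setup):
{
    "run_name": "t5-finetune-custom",
    "model_name_or_path": "/train/checkpoint-1024",
    "dataset": "my_custom_dataset",
    "output_dir": "/finetune"
}
with all other keys kept as in the train.json above.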
Q4: Use the eval Docker image if you want PICARD-constrained inference; use the train image if you don't want/need it.
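For the PICARD-constrained evaluation itself, a hedged sketch of the relevant keys, modeled on the repo's eval config (the model path is a placeholder, and the picard_* values should be double-checked against the shipped eval.json):
{
    "model_name_or_path": "/finetune/checkpoint-1024",
    "do_train": false,
    "do_eval": true,
    "use_picard": true,
    "launch_picard": true,
    "picard_mode": "parse_with_guards",
    "picard_schedule": "incremental",
    "picard_max_tokens_to_check": 2,
    "num_beams": 4,
    "predict_with_generate": true
}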
Q5: You should try those. If you use train.json and do not change anything else, there is no good reason to retrain on Spider.
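Concretely, that would mean pointing the training config at the published checkpoint instead of t5-base, e.g. (assuming the checkpoint is available on the Hugging Face hub under tscholak/3vnuv1vf):
{
    "model_name_or_path": "tscholak/3vnuv1vf",
    "dataset": "my_custom_dataset"
}
with everything else as in train.json.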

Good luck!

Thank you, this was really helpful!