Help with reproducing T5-3b number on Spider
Hi,
I'm trying to reproduce the ST number from Table 2 with T5-3B on Spider.
I'm using the following command on 16 A100 GPUs:
deepspeed train.py --deepspeed deepspeed/ds_config_zero2.json --seed 2 --cfg Salesforce/T5_3b_finetune_spider_with_cell_value.cfg --run_name T5_3b_finetune_spider --logging_strategy steps --logging_first_step true --logging_steps 4 --evaluation_strategy steps --eval_steps 250 --metric_for_best_model avr --greater_is_better true --save_strategy steps --save_steps 250 --save_total_limit 1 --load_best_model_at_end --gradient_accumulation_steps 8 --num_train_epochs 80 --adafactor false --learning_rate 5e-5 --do_train --do_eval --do_predict --predict_with_generate --output_dir output/T5_3b_finetune_spider --overwrite_output_dir --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --generation_num_beams 4 --generation_max_length 128 --input_max_length 1024 --ddp_find_unused_parameters true
Does it look right? I get 68.83 using this command. Could you help me with the command that can reproduce 71.76 on Spider? Thanks!
Hi, can you try --generation_num_beams 1 instead of --generation_num_beams 4? It should produce better results.
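If the command lives in a launch script, the flag swap suggested above could be applied programmatically. This is only a rough sketch: `set_flag` is a hypothetical helper (not part of the repository), and the command string is a shortened stand-in for the full one in the question.

```python
import shlex


def set_flag(command: str, flag: str, value: str) -> str:
    """Return `command` with the argument following `flag` set to `value`.

    Hypothetical helper for illustration; appends the flag if it is absent.
    """
    args = shlex.split(command)
    if flag not in args:
        # Flag not present: add it at the end of the command.
        return shlex.join(args + [flag, value])
    i = args.index(flag)
    if i + 1 >= len(args):
        raise ValueError(f"{flag} has no value to replace")
    args[i + 1] = value
    return shlex.join(args)


# Shortened stand-in for the training command (assumption: the other flags stay unchanged).
original = "deepspeed train.py --generation_num_beams 4 --generation_max_length 128"
updated = set_flag(original, "--generation_num_beams", "1")
print(updated)
```

Editing the flag by hand works just as well; the helper only makes the single changed argument explicit.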
I'll close this issue. If you have further questions, feel free to re-open it!