Instruction-Tuning-with-GPT-4/GPT-4-LLM

Why does the finetuning take more time compared with Alpaca?

waterhorse1 opened this issue · 2 comments

Alpaca's finetuning takes about 3h, but for the GPT-4 data in this repo it takes about 13h on 8 A100 GPUs. What explains the difference?

Can you increase the batch size to 4?

The hyperparameters are set to bs=1 for 16 V100s, so I set it to 2 for my setup. I will try 4.
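
For reference, here is a minimal sketch of how one might raise the per-device batch size while keeping the effective batch size fixed, which usually shortens wall-clock time without changing the optimization. This assumes the training script uses HuggingFace `TrainingArguments` the way Stanford Alpaca's `train.py` does, and that the target effective batch size is 128 (Alpaca's default of 4 per device × 4 GPUs × 8 accumulation steps); the repo's actual values may differ.

```python
# Sketch only: adjusting per-device batch size vs. gradient accumulation
# so the effective batch size stays constant. Values are assumptions,
# not the repo's confirmed config.
from transformers import TrainingArguments

NUM_GPUS = 8                 # 8x A100 in the setup discussed above
EFFECTIVE_BATCH_SIZE = 128   # assumed target (Alpaca's default)

per_device_bs = 4            # raised from 1 (the 16x V100 setting)
grad_accum = EFFECTIVE_BATCH_SIZE // (per_device_bs * NUM_GPUS)  # -> 4

args = TrainingArguments(
    output_dir="./output",
    num_train_epochs=3,
    per_device_train_batch_size=per_device_bs,
    gradient_accumulation_steps=grad_accum,
    bf16=True,               # A100s support bfloat16
)
```

Larger per-device batches amortize per-step overhead across more samples, so fewer optimizer steps are needed per epoch; whether bs=4 fits depends on the model size and sequence length (the GPT-4 responses are generally longer than Alpaca's, which also increases per-step cost).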