Why does finetuning take more time compared with Alpaca?
waterhorse1 opened this issue · 2 comments
waterhorse1 commented
Alpaca's finetuning takes about 3 h, but finetuning on the GPT-4 data in this repo takes about 13 h on 8 A100 GPUs. What is the difference?
Instruction-Tuning-with-GPT-4 commented
Can you increase the batch size to 4?
waterhorse1 commented
The hyperparameters specify bs=1 for 16 V100s, so I set it to 2. I will try 4.
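One way to raise the per-device batch size without changing the optimization itself is to lower gradient accumulation by the same factor, keeping the effective batch fixed. A minimal sketch, assuming the Hugging Face Trainer convention (effective batch = per-device batch × gradient-accumulation steps × number of GPUs) and the Alpaca recipe's roughly 52K examples, 3 epochs, and effective batch of 128; the exact numbers here are illustrative:

```python
# Sketch: increasing per-device batch size while lowering gradient
# accumulation keeps the effective batch (and optimizer step count) fixed.
# Assumed convention: effective_batch = per_device_bs * grad_accum * n_gpus.

def optimizer_steps(n_examples, epochs, per_device_bs, grad_accum, n_gpus):
    """Total optimizer steps for one finetuning run (drops the last partial batch)."""
    effective_batch = per_device_bs * grad_accum * n_gpus
    return (n_examples * epochs) // effective_batch

# Effective batch held at 128 on 8 GPUs, as in the Alpaca recipe:
steps_bs1 = optimizer_steps(52_000, 3, per_device_bs=1, grad_accum=16, n_gpus=8)
steps_bs4 = optimizer_steps(52_000, 3, per_device_bs=4, grad_accum=4, n_gpus=8)
print(steps_bs1, steps_bs4)  # same step count either way
```

The step count is identical in both configurations; the wall-clock speedup from a larger per-device batch comes from better GPU utilization per forward/backward pass, not from fewer steps.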