Why does finetuning take more time compared with Alpaca?
waterhorse1 opened this issue · 2 comments
waterhorse1 commented
Alpaca's finetuning takes about 3 h, but finetuning on the GPT-4 data in this repo takes about 13 h on 8 A100 GPUs. What is the difference?
Instruction-Tuning-with-GPT-4 commented
Can you increase the batch size to 4?
waterhorse1 commented
The hyperparameters specify bs=1 for 16 V100s, so I set it to 2. I will try 4.
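One way to raise the per-device batch size without changing the optimization itself is to lower gradient accumulation by the same factor, keeping the effective batch fixed. A minimal sketch, assuming the Hugging Face Trainer convention (effective batch = per-device batch × gradient-accumulation steps × number of GPUs) and the Alpaca recipe's roughly 52K examples, 3 epochs, and effective batch of 128; the exact numbers here are illustrative:

```python
# Sketch: increasing per-device batch size while lowering gradient
# accumulation keeps the effective batch (and optimizer step count) fixed.
# Assumed convention: effective_batch = per_device_bs * grad_accum * n_gpus.

def optimizer_steps(n_examples, epochs, per_device_bs, grad_accum, n_gpus):
    """Total optimizer steps for one finetuning run (drops the last partial batch)."""
    effective_batch = per_device_bs * grad_accum * n_gpus
    return (n_examples * epochs) // effective_batch

# Effective batch held at 128 on 8 GPUs, as in the Alpaca recipe:
steps_bs1 = optimizer_steps(52_000, 3, per_device_bs=1, grad_accum=16, n_gpus=8)
steps_bs4 = optimizer_steps(52_000, 3, per_device_bs=4, grad_accum=4, n_gpus=8)
print(steps_bs1, steps_bs4)  # same step count either way
```

The step count is identical in both configurations; the wall-clock speedup from a larger per-device batch comes from better GPU utilization per forward/backward pass, not from fewer steps.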