Training (finetuning lip) is too slow.
Hi.
First of all, thank you for sharing this great work.
I am trying to train the model from scratch on the Theresa May dataset.
The coarse stage of training went well and was quite fast (~40 it/s; total elapsed time was about 30 minutes).
The problem is the fine stage. As seen in the figure, a single step takes more than a second, and GPU utilization is quite low (sometimes ~40%, but mostly 0%).
I tried to find where the bottleneck is, but I couldn't figure it out.
My machine has an RTX 4090, and I checked that there was no RAM/VRAM shortage during training.
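
In case it helps, this is roughly how I tried to profile a training step. It is only a minimal sketch with `torch.profiler`; the model, batch, and loss here are placeholders standing in for the actual fine-stage step, not the repo's code:

```python
import torch
import torch.nn as nn
from torch.profiler import profile, ProfilerActivity

# Placeholder model/optimizer/batch standing in for the fine-stage training step.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(256, 256).to(device)
optimizer = torch.optim.Adam(model.parameters())
batch = torch.randn(64, 256, device=device)

# Profile a handful of steps on both CPU and CUDA.
with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
    for _ in range(10):
        optimizer.zero_grad()
        loss = model(batch).pow(2).mean()
        loss.backward()
        optimizer.step()

# Sort by CPU time to see whether CPU-side work (e.g. data loading)
# dominates the step, which would explain the ~0% GPU utilization.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=20))
```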
I wonder why the fine stage is so much slower than the coarse stage. Is it normal for the fine stage to train more slowly, or is there something wrong with my environment?
Thank you.