Confusion about training time?

Question

rightchose opened this issue 3 years ago · 7 comments

How long does it take to train the model for one epoch?

Answer 1 · 2021-08-20T09:04:19.000Z

Thanks for your interest! It takes about 4 hours/epoch in the first stage and 2 hours/epoch in the third stage in our experimental environments.

Answer 2 · 2021-08-20T10:38:56.000Z

Thanks for your answer! But if the number of epochs in the first stage is set to 30 (as in the paper) or 100 (the code default), training may take too long. In practice, did you have any ways to reduce the training time or get feedback more quickly?
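
To put rough numbers on this, here is a back-of-the-envelope estimate. It assumes the figure of about 4 hours/epoch quoted above holds for every epoch of the first stage; the helper name `total_training_time` is made up for illustration and is not part of the repository.

```python
# Per-epoch time comes from the maintainers' answer above; the epoch counts
# (30 in the paper, 100 as the code default) come from this question.
HOURS_PER_EPOCH_STAGE1 = 4


def total_training_time(epochs, hours_per_epoch):
    """Return (total_hours, total_days) for a stage, assuming a constant epoch time."""
    hours = epochs * hours_per_epoch
    return hours, hours / 24


for label, epochs in [("paper setting", 30), ("code default", 100)]:
    hours, days = total_training_time(epochs, HOURS_PER_EPOCH_STAGE1)
    print(f"{label}: {epochs} epochs -> {hours} h (~{days:.1f} days)")
```

So the first stage alone would take about 5 days at 30 epochs and more than two weeks at 100 epochs, if the per-epoch time stays constant.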

Answer 3 · 2021-08-20T12:08:52.000Z

Training the full model with only two 11 GB GPUs is certainly time-consuming. If more computational resources are available, you could speed up training by using a larger batch size and tuning the learning-rate hyper-parameters to match. In addition, we recommend strategies such as training a smaller model (halving the number of channels) or training on a smaller dataset (a subset of the full dataset).
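
The three suggestions above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the linear learning-rate scaling rule is a common heuristic rather than something confirmed for this repository, and all function names and numeric values (base learning rate, batch sizes, dataset size, channel counts) are hypothetical placeholders.

```python
import random


def scaled_lr(base_lr, base_batch_size, new_batch_size):
    """Linear scaling heuristic: change the learning rate in proportion to the batch size."""
    return base_lr * new_batch_size / base_batch_size


def subset_indices(dataset_size, fraction, seed=0):
    """Pick a fixed, reproducible random subset of sample indices for quicker experiments."""
    rng = random.Random(seed)
    k = max(1, int(dataset_size * fraction))
    return sorted(rng.sample(range(dataset_size), k))


def conv_weight_count(in_channels, out_channels, kernel_size=3):
    """Number of weights in a 2D convolution layer, ignoring the bias."""
    return in_channels * out_channels * kernel_size * kernel_size


# Placeholder values, not taken from the repository.
new_lr = scaled_lr(base_lr=1e-4, base_batch_size=4, new_batch_size=16)
indices = subset_indices(dataset_size=1000, fraction=0.25)
full = conv_weight_count(64, 64)
half = conv_weight_count(32, 32)

print(f"learning rate for the larger batch: {new_lr:.1e}")
print(f"training on {len(indices)} of 1000 samples")
print(f"conv weights: {full} at full width, {half} at half width")
```

Halving both the input and output channels of a convolution cuts its weights to a quarter, which is why the smaller model trains noticeably faster. A scaled learning rate is only a starting point and usually still needs tuning.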

Answer 4 · 2021-08-20T13:33:56.000Z

Thanks a lot! I will try it.

Answer 5 · 2021-08-31T07:27:48.000Z

I'm now training the model in stage 3. How many epochs did it take for the model to converge in this stage?

Answer 6 · 2021-08-31T07:51:59.000Z

You could refer to this question. If you have more computational resources, we strongly recommend larger crop resolutions and larger batch sizes to accelerate training, with the learning rate adjusted accordingly.

Answer 7 · 2021-08-31T12:14:00.000Z

Thanks! I will try these solutions.