Time for training a new style

Question

nitish116 opened this issue 7 years ago · 3 comments

Hi,
I am using a p2.xlarge instance on AWS (NVIDIA K80 GPU, 61 GB RAM, 128 GB disk space). To try it out, I am training on COCO's val2017 (5K images). It downloaded the VGG model file, and since then the output has been blank for a very long time.
0. How long does it take to train on 5000 images?
1. Also, when I try to kill the Python process with "kill -9 PID", it does not get killed. Is that because of multithreading? How do I stop the training process partway through?

--> I am able to run the 'eval' option with cuda 1, and it takes about 1-2 seconds to generate output images, so I assume the package installations are fine on my side.

Please help me resolve the issues.

Answer 1 · 2017-09-21T13:36:18.000Z

It should be able to train quite quickly on 5000 images. To check whether training has started, set the --log-interval parameter to 1 and see whether output messages are printed. Also run the training command with Python's unbuffered option: python -u train.py... You can also test training with a batch size of 1 on your local laptop/desktop to check that everything works correctly before running on AWS.
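
To illustrate why those two settings matter, here is a minimal sketch, assuming a typical training loop that prints a message every `log_interval` batches. `should_log` and `log_progress` are hypothetical helpers written for this example, not functions from train.py. With a large interval nothing is printed for a long time, and without flushing (which is what `python -u` effectively forces) messages can sit in a buffer even after they are written.

```python
import sys


def should_log(batch_index: int, log_interval: int) -> bool:
    """Return True when a progress message is due (batch_index is 0-based).

    Hypothetical helper: with log_interval=1 every batch is reported.
    """
    return (batch_index + 1) % log_interval == 0


def log_progress(batch_index: int, loss: float, stream=sys.stdout) -> None:
    """Write one progress line and flush it right away.

    Flushing after each message has a similar effect to `python -u`:
    the line shows up immediately instead of waiting in a buffer.
    """
    stream.write(f"batch {batch_index + 1}\tloss {loss:.4f}\n")
    stream.flush()
```

With an interval of 500 the first message only appears after batch 500, which can look like a hang on a slow GPU; an interval of 1 reports after the very first batch.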

Answer 2 · 2017-09-27T21:48:43.000Z

@abhiskk How many images should we use when training a new model?
Is it worth it to go over 10K, 20K, 80K images?

Answer 3 · 2017-09-27T21:50:27.000Z

@glesperance you can get good results with around 10K images. I would also suggest monitoring the loss during training to get good-quality results.
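
One way to monitor the loss is to track a smoothed value, since per-batch losses are noisy. This is only a sketch under that assumption: `LossMonitor` is a hypothetical class for illustration, not part of the training script, and the smoothing factor is an arbitrary choice.

```python
class LossMonitor:
    """Track an exponential moving average of the training loss.

    Hypothetical helper: a higher `smoothing` value gives a steadier
    but slower-reacting curve.
    """

    def __init__(self, smoothing: float = 0.9) -> None:
        self.smoothing = smoothing
        self.average = None  # no loss values seen yet

    def update(self, loss: float) -> float:
        """Fold one batch loss into the running average and return it."""
        if self.average is None:
            self.average = loss
        else:
            self.average = (
                self.smoothing * self.average + (1.0 - self.smoothing) * loss
            )
        return self.average
```

Calling `update` with each batch loss and printing the result now and then shows whether the smoothed loss is still falling or has levelled off.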