Is the net created by the Python script different from every pre-trained model offered in the README.md?

Question

wangzhe0623 opened this issue 7 years ago · 11 comments

Hi, I am trying to fine-tune a pre-trained model on my own dataset, and I need to change the number of classes from 21 to 2. I planned to modify the Python script instead of editing the prototxt files. But the model the Python script creates is about "28.6M", which is different from every model offered in this repository. If I want to train a model with 2 classes, should I train it without a pre-trained model?

Many thanks!

Answer 1 · 2017-08-24T02:33:50.000Z

Also, I tried training the model without a pre-trained model. The "detection_eval" increases slowly; it is about "0.2" at "Iteration 60000".

Answer 2 · 2017-08-24T05:05:47.000Z

Hi @wangzhe0623,

The inconsistent model size is caused by the final classification layers (different numbers of categories). We adopted the same network structure for VOC and COCO but obtained totally different model sizes (59.2M vs. 87.2M). If you want to use our pre-trained model, use this script (https://github.com/weiliu89/caffe/blob/ssd/examples/convert_model.ipynb) provided by SSD's author to convert our model to your data, or simply rename the prediction layers.
I think #7 will help you.
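
To illustrate why the number of categories changes the model size, here is a minimal sketch that counts the parameters of one SSD-style confidence (classification) prediction layer. The function name `conf_layer_params`, the 3x3 kernel, the 512 input channels, and the 6 priors per location are assumptions chosen for illustration, not values taken from this repository; 81 is used as a typical COCO class count in SSD.

```python
def conf_layer_params(in_channels, num_priors, num_classes, kernel_size=3):
    """Weights plus biases of one confidence prediction conv layer.

    The layer outputs one score per (prior box, class) pair at every
    location, so its output channel count grows linearly with num_classes.
    """
    out_channels = num_priors * num_classes
    weights = out_channels * in_channels * kernel_size * kernel_size
    biases = out_channels
    return weights + biases


# Hypothetical feature-map settings, for illustration only.
IN_CHANNELS = 512
NUM_PRIORS = 6

for num_classes in (2, 21, 81):
    count = conf_layer_params(IN_CHANNELS, NUM_PRIORS, num_classes)
    print("num_classes=%d -> %d parameters in this layer" % (num_classes, count))
```

Because only the prediction layers depend on the class count, the backbone weights keep the same shapes across datasets, which is why renaming or converting the prediction layers is enough to reuse a pre-trained model.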

Answer 3 · 2017-08-24T06:06:59.000Z

@szq0214 Thanks a lot~ But I think I didn't make myself clear in my last comment. I have converted my dataset to VOC style, and "test_name_size" and "labelmap" are also OK. I changed "num_classes" in "DSOD300_pascal.py", then ran it without a pre-trained model. Now the "detection_eval" is still rising slowly after 70000 iterations; it's lower than 0.2~ Is that normal? Should I continue training this model or stop to check?
Thanks again!

Answer 4 · 2017-08-24T07:02:48.000Z

Hi @wangzhe0623, such poor performance is not normal. I think you should stop it and try other parameters. BTW, what "accum_batch_size" do you use?

Answer 5 · 2017-08-24T08:07:49.000Z

@szq0214 I set "accum_batch_size" to 8, and "batch_size" is 8 as well.

Answer 6 · 2017-08-24T08:24:30.000Z

@wangzhe0623 "accum_batch_size" is too small; try 128 or another larger number.

Answer 7 · 2017-08-24T08:51:09.000Z

@szq0214 OK, I set iter_size to 8 in "solver.prototxt" instead. Should I restart from scratch, or fine-tune from the current low-precision model? Thanks so much~
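
The relationship between "batch_size", "accum_batch_size", and "iter_size" discussed above can be sketched as follows. This assumes the training script derives "iter_size" the way SSD-style Caffe scripts commonly do (accumulated batch divided by the per-iteration batch, rounded up); the helper names `iter_size_for` and `effective_batch_size` are hypothetical and not part of this repository.

```python
import math


def iter_size_for(accum_batch_size, batch_size, num_gpus=1):
    """Number of forward/backward passes accumulated before one weight update."""
    per_device = int(math.ceil(float(batch_size) / num_gpus))
    return int(math.ceil(float(accum_batch_size) / (per_device * num_gpus)))


def effective_batch_size(batch_size, iter_size):
    """Number of images that contribute to a single weight update."""
    return batch_size * iter_size


# Settings from the thread: batch_size 8 and accum_batch_size 8.
print(iter_size_for(8, 8))         # no accumulation at all
# Suggested setting: accum_batch_size 128 with the same batch_size.
print(iter_size_for(128, 8))
# Setting iter_size to 8 directly in the solver with batch_size 8.
print(effective_batch_size(8, 8))
```

With "batch_size" 8, an "accum_batch_size" of 8 means every update sees only 8 images, whereas an "iter_size" of 8 raises that to 64 without needing more GPU memory per pass.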

Answer 8 · 2017-08-24T09:07:25.000Z

@wangzhe0623 I recommend restarting from scratch.

Answer 9 · 2017-08-24T09:17:35.000Z

@szq0214 OK, I will try it. Thanks a lot!

Answer 10 · 2017-08-25T03:34:35.000Z

@szq0214 It worked. "iter_size" is quite important when training this kind of net; I used to ignore it. Many, many thanks~
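
As a rough illustration of what "iter_size" does, the sketch below shows gradient accumulation on a toy one-parameter least-squares problem: averaging the gradients of several small batches gives the same gradient as one large batch. The toy data, the model, and the function names are assumptions made for illustration; a real detector with batch-dependent layers may not behave identically.

```python
def mse_grad(w, batch):
    """Gradient of the mean of 0.5 * (w * x - y) ** 2 over the batch, w.r.t. w."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, batch_size):
    """Average the per-batch gradients over equally sized mini-batches."""
    chunks = [data[i:i + batch_size] for i in range(0, len(data), batch_size)]
    return sum(mse_grad(w, chunk) for chunk in chunks) / len(chunks)


# Toy dataset: 16 points on the line y = 2x + 1.
data = [(float(x), 2.0 * x + 1.0) for x in range(16)]
w = 0.5

full = mse_grad(w, data)                          # one batch of 16
accum = accumulated_grad(w, data, batch_size=4)   # 4 batches of 4, i.e. iter_size 4

print(abs(full - accum) < 1e-9)
```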

Answer 11 · 2017-08-25T04:11:48.000Z

@wangzhe0623 :)