What's the difference in performance between this new code you pushed and the previous code?
Sundrops opened this issue · 6 comments
Hi Wei Liu,
Thanks for sharing the code! I saw you pushed some new code a few days ago. For example, expand_param was added to ssd_pascal.py (and ssd_pascal_500.py was deleted). Does the new code outperform the previous version? I also saw you uploaded more pretrained models. Which one should I choose to fine-tune on my dataset (pictures captured by a monitor)? I want to detect the heads of pedestrians. Can you give me some advice? Thanks a lot!
You can check README.md for more results. The new model outperforms the previous one by 5+ points (e.g. 77.* vs. 72.* mAP for SSD300 on the VOC07 test set). We have an updated arXiv paper describing the details. To summarize, here are some key changes:
- More data augmentation: the expansion trick and color distortion.
- Default boxes are no longer clipped at the image border.
- Hard negative mining is now based on confidence loss instead of the background class score.
- pool6 was removed and replaced with a convolution, and the extra layers were readjusted a bit.
- A constant step is specified when generating prior boxes.
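The expansion trick in the first bullet can be sketched as follows. This is a minimal NumPy illustration, not the actual implementation (which lives in the C++ data transformer); the helper name `expand_image` and its parameters are hypothetical:

```python
import numpy as np

def expand_image(img, boxes, ratio, mean=(104, 117, 123)):
    """Place `img` at a random offset inside a canvas `ratio` times larger,
    filled with the per-channel mean, and shift the ground-truth boxes to
    match. Boxes are [xmin, ymin, xmax, ymax] in absolute pixels."""
    h, w, c = img.shape
    new_h, new_w = int(h * ratio), int(w * ratio)
    canvas = np.empty((new_h, new_w, c), dtype=img.dtype)
    canvas[:] = mean  # fill with the dataset mean so padding looks "neutral"
    top = np.random.randint(0, new_h - h + 1)
    left = np.random.randint(0, new_w - w + 1)
    canvas[top:top + h, left:left + w] = img
    boxes = boxes + np.array([left, top, left, top])
    return canvas, boxes
```

The effect is to synthesize more small objects: the original image (and its boxes) shrinks relative to the network input, which is why this augmentation helps most on small-object detection.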
The COCO models should have the most accurate bbox predictions. You can take the location prediction from a COCO model and then add classification prediction layers for your classes. You can check examples/convert_model.ipynb for how we convert a pretrained model to a VOC model.
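The idea behind that conversion can be sketched like this: copy weights for every layer whose name and shape match, and leave the rest (e.g. confidence layers sized for a different number of classes) at their fresh initialization. This is an illustration only, with NumPy dicts standing in for Caffe blobs; it is not the code in convert_model.ipynb:

```python
import numpy as np

def copy_matching_params(src, dst):
    """Copy weights from `src` into `dst` for every layer whose name and
    shape match; return the names of the layers that were copied."""
    copied = []
    for name, w in dst.items():
        if name in src and src[name].shape == w.shape:
            dst[name] = src[name].copy()
            copied.append(name)
    return copied
```

Layers like the mbox_conf branches change shape when the class count changes (81 classes on COCO vs. your own label set), so they are skipped and must be retrained; the shared backbone and location branches carry over.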
Thanks for your detailed answer. By COCO models, do you mean VGG_coco_SSD_512x512_iter_360000.caffemodel (81 classes) or VGG_coco_SSD512x512.caffemodel (21 classes)? Also, I only have one GPU; how should I change the batch_size? (batch_size is 32 in your code for 4 GPUs.)
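One common way to handle the single-GPU case (an assumption on my part, not an answer given in this thread) is gradient accumulation: lower the per-iteration batch_size to fit memory and raise the solver's iter_size so the effective batch size stays the same. A small sketch of the arithmetic, with a hypothetical helper name:

```python
def solver_batch_settings(effective_batch, per_gpu_batch, num_gpus=1):
    """Keep the effective batch size constant on fewer GPUs by accumulating
    gradients: effective batch = per_gpu_batch * num_gpus * iter_size."""
    per_iter = per_gpu_batch * num_gpus
    assert effective_batch % per_iter == 0, "effective batch must divide evenly"
    return {"batch_size": per_gpu_batch,
            "iter_size": effective_batch // per_iter}
```

For example, reproducing an effective batch of 32 on one GPU with room for only 8 images per iteration would use batch_size 8 and iter_size 4.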
Oh I see. VGG_coco_SSD512x512.caffemodel is converted from VGG_coco_SSD_512x512_iter_360000.caffemodel. But my dataset is totally different from COCO; there are no labels in common. Can I fine-tune VGG_coco_SSD_512x512_iter_360000 directly on my dataset without converting it?
Hi, Wei. Recently, I trained with your modified code again and compared it with my previous models. The results were:
- Accuracy: the new version is better than the previous one.
- Training and inference time: the new version is much slower than the previous one.
  - Training time per 10 iterations: new version 25 s vs. previous 6 s
  - Inference time (average per image): new version 19 ms vs. previous 14 ms
I wonder which parts cause the additional processing.
Hi,
About the new training: I ran some tests with caffe time (on train.prototxt):
- New vs. old caffe-ssd on the old prototxt: 0.81x (the new code is faster)
- New prototxt vs. old prototxt: 2.4x
- Without augmentation, new vs. old: 1.4x
- Without augmentation and expansion: 1.05x
- Without augmentation, expansion, and mining: 1x
- With 0.2 probability (vs. 0.5) for all augmentation and expansion: 1.4x
According to caffe time, mining is not that expensive. But in practice the new training is 3-4x slower.
So my questions are: can I use the old pretrained COCO model and then train on my own dataset? Or do I need to train it with steps and offsets and no clipping? Or do I need to turn on all the new features?
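For reference, the steps/offsets/no-clip settings mentioned above appear in the generated prototxt as PriorBox layer parameters, roughly like this (the values shown are illustrative, taken from a conv4_3 branch of a released SSD300 prototxt; verify against your own generated files):

```protobuf
layer {
  name: "conv4_3_norm_mbox_priorbox"
  type: "PriorBox"
  bottom: "conv4_3_norm"
  bottom: "data"
  top: "conv4_3_norm_mbox_priorbox"
  prior_box_param {
    min_size: 30.0
    max_size: 60.0
    aspect_ratio: 2
    flip: true
    clip: false   # new behavior: default boxes are not clipped at the border
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    step: 8       # constant step instead of inferring it from feature map size
    offset: 0.5
  }
}
```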
The current code might not be fully optimized for some of the preprocessing and postprocessing steps. Feel free to send a pull request if you have a fix.