weiliu89/caffe

What's the difference in performance between the new code you pushed and the previous code?

Sundrops opened this issue · 6 comments

Hi Wei Liu,

Thanks for sharing the code! I saw you pushed some new code a few days ago; for example, expand_param was added to ssd_pascal.py (and ssd_pascal_500.py was deleted). Does the new code outperform the previous version? I also saw you uploaded more pretrained models. Which one should I choose to fine-tune on my dataset (images captured from a surveillance monitor)? I want to detect pedestrians' heads. Can you give me some advice? Thanks a lot!

You can check README.md for more results. The new model outperforms the previous one by 5+ points (e.g. 77.* vs. 72.* mAP for SSD300 on VOC07 test). We have an updated arXiv paper describing the details. To summarize, here are the key changes:

  1. More data augmentation ==> the expansion trick and color distortion.
  2. Default boxes are no longer clipped at the image border.
  3. Hard negative mining is now based on the confidence loss instead of the background class score.
  4. pool6 is removed and replaced with a convolution layer, and the extra layers are readjusted a bit.
  5. A constant step is specified when generating the prior boxes.
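The expansion trick in item 1 can be illustrated with a minimal NumPy sketch (this is not the actual Caffe implementation; the `prob`/`max_expand_ratio` knobs it mirrors are the `expand_param` fields added to ssd_pascal.py, and the function name here is just for illustration):

```python
import numpy as np

def expand_image(img, boxes, max_expand_ratio=4.0, mean=(104, 117, 123), rng=None):
    """Place the image at a random spot inside a larger canvas filled with
    the channel means, so objects become relatively smaller (which helps
    small-object detection). Boxes are absolute pixel coordinates
    [xmin, ymin, xmax, ymax]."""
    rng = rng or np.random.default_rng()
    h, w, c = img.shape
    ratio = rng.uniform(1.0, max_expand_ratio)
    new_h, new_w = int(h * ratio), int(w * ratio)
    # Canvas filled with the per-channel mean (broadcast over H and W).
    canvas = np.empty((new_h, new_w, c), dtype=img.dtype)
    canvas[...] = np.array(mean, dtype=img.dtype)
    top = rng.integers(0, new_h - h + 1)
    left = rng.integers(0, new_w - w + 1)
    canvas[top:top + h, left:left + w] = img
    # Shift the ground-truth boxes by the same offset; sizes are unchanged.
    shifted = boxes + np.array([left, top, left, top])
    return canvas, shifted
```

In the real training pipeline this is applied with some probability per sample (prob 0.5 by default) before the usual crop sampling.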

The COCO models should have the most accurate bbox prediction. You can take the location prediction from a COCO model and then add a classification prediction layer for your classes. You can check examples/convert_model.ipynb for how we convert a pretrained model to a VOC model.
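The core idea behind that conversion can be sketched without Caffe installed: copy every parameter blob whose name and shape match, and leave the class-dependent confidence layers (whose channel counts differ) with their fresh initialization. Here `src`/`dst` stand in for the `net.params`-style dicts of blob lists; this is a plain-NumPy sketch, not the notebook's actual code:

```python
import numpy as np

def transfer_params(src, dst):
    """Copy parameter blobs from a source net (e.g. COCO-trained SSD)
    into a destination net whenever the layer name exists in both and
    every blob shape matches. Layers such as the *_mbox_conf predictors,
    whose output channels depend on the number of classes, are skipped
    and keep their random initialization."""
    copied, skipped = [], []
    for name, blobs in src.items():
        if name in dst and all(a.shape == b.shape for a, b in zip(blobs, dst[name])):
            for a, b in zip(blobs, dst[name]):
                b[...] = a  # in-place copy into the destination blob
            copied.append(name)
        else:
            skipped.append(name)
    return copied, skipped
```

With pycaffe the same loop would run over `src_net.params` and `dst_net.params[...].data` before calling `dst_net.save(...)`.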

Thanks for your detailed answer. By the COCO models, do you mean VGG_coco_SSD_512x512_iter_360000.caffemodel (81 classes) or VGG_coco_SSD512x512.caffemodel (21 classes)? Also, I only have one GPU; how should I change the batch_size? (batch_size is 32 in your code for 4 GPUs.)
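For reference, ssd_pascal.py keeps the effective batch size constant through gradient accumulation: it derives a per-device batch size and a solver `iter_size` from `batch_size`, `accum_batch_size`, and the GPU count. A sketch of that logic (paraphrased from memory of the script, so treat field names as assumptions):

```python
import math

def solver_batch_params(batch_size, accum_batch_size, num_gpus):
    """Split the batch across GPUs and grow iter_size so that
    batch_size_per_device * num_gpus * iter_size >= accum_batch_size,
    i.e. the effective (accumulated) batch size stays the same."""
    if num_gpus > 0:
        batch_size_per_device = int(math.ceil(float(batch_size) / num_gpus))
        iter_size = int(math.ceil(float(accum_batch_size) /
                                  (batch_size_per_device * num_gpus)))
    else:
        batch_size_per_device = batch_size
        iter_size = int(math.ceil(float(accum_batch_size) / batch_size))
    return batch_size_per_device, iter_size
```

So on a single GPU with limited memory you can lower `batch_size` (say to 8) while keeping `accum_batch_size` at 32, and the solver will accumulate gradients over `iter_size = 4` iterations to preserve the effective batch of 32.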

Oh I see. VGG_coco_SSD512x512.caffemodel is converted from VGG_coco_SSD_512x512_iter_360000.caffemodel. But my dataset is totally different from COCO; there are no common labels. Can I fine-tune VGG_coco_SSD_512x512_iter_360000 directly on my dataset without converting it?

@weiliu89

Hi, Wei. Recently I trained with your modified code again and compared the results with my previous models.

But the results were:

  1. Accuracy: the new version is better than the previous one.

  2. Train time & inference time: the new version is much slower than the previous one.

Train time per 10 iterations:
-- new version: 25 s vs. the previous one: 6 s

Inference time (avg per image):

-- new version: 19 ms vs. the previous one: 14 ms

I wonder which parts cause the additional processing.

Hi,
About the new training:
I ran some tests with caffe time (on train.prototxt):

  1. New vs. old caffe-ssd on the old prototxt: 0.81x (the new code is faster)
  2. New prototxt vs. old prototxt: 2.4x
  3. Without augmentation, new vs. old: 1.4x
  4. Without augmentation and expansion: 1.05x
  5. No augmentation, no expansion, no mining: 1x
  6. With 0.2 probability (vs. 0.5) for all augmentation and expansion: 1.4x

Mining does not look that costly under caffe time, but in practice the new training is 3-4x slower.
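For anyone repeating these measurements, a small helper to pull the averages out of caffe time output and form the new/old ratios (the log format here, lines like "Average Forward pass: 103.916 ms.", is from memory; treat this as a sketch):

```python
import re

def parse_caffe_time(log):
    """Extract the averaged timings (in ms) from `caffe time` output."""
    times = {}
    for phase in ("Forward pass", "Backward pass", "Forward-Backward"):
        m = re.search(r"Average %s:\s*([\d.]+) ms" % re.escape(phase), log)
        if m:
            times[phase] = float(m.group(1))
    return times

def slowdown(new_log, old_log, phase="Forward-Backward"):
    """Ratio > 1 means the new prototxt/code is slower for that phase."""
    return parse_caffe_time(new_log)[phase] / parse_caffe_time(old_log)[phase]
```

Note that caffe time only exercises the network forward/backward passes, which is consistent with the observation above: the data-layer preprocessing (augmentation, expansion) runs outside what caffe time measures, so real training can be slower than these ratios suggest.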

Also, a question: can I use the old pretrained COCO model and then train on my own dataset? Or do I need to retrain it with steps, offsets, and no clipping? Or do I need to turn on all the new features?

The current code might not be fully optimized for some of the preprocessing and postprocessing steps. Feel free to send a pull request if you have a fix.