
TorchCV: a PyTorch vision library that mimics ChainerCV

Detection

Model                | Original Paper | ChainerCV | TorchCV*
SSD300@voc07_test    | 74.3%          | 77.8%     | 76.68%
SSD512@voc07_test    | 76.8%          | 79.2%     | 78.89%
FPNSSD512@voc07_test | -              | -         | 81.46%

* The accuracy of the TorchCV SSD models is ~1% lower than ChainerCV's because the VGG16 base model I use performs slightly worse.
When I repeated the experiment with the pytorch/vision VGG16 model replaced by the VGG16 used in ChainerCV, the SSD512 model reached 79.85% accuracy.
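For reference, that base-model swap can be sketched roughly as below. This is not the repository's code: it assumes ChainerCV's VGG16 weights have already been converted to a PyTorch state_dict, and the file name vgg16_from_chainercv.pth is hypothetical.

```python
# Hedged sketch of the base-model swap, not the repository's exact code.
# Assumes ChainerCV's VGG16 weights were already converted to a PyTorch
# state_dict (the file name below is hypothetical).
import torch
import torchvision

vgg = torchvision.models.vgg16()  # torchvision VGG16, randomly initialized

chainercv_state = torch.load('vgg16_from_chainercv.pth', map_location='cpu')

# Keep only tensors whose names and shapes match the torchvision layout,
# then load them non-strictly so any leftover keys are simply ignored.
own_state = vgg.state_dict()
converted = {k: v for k, v in chainercv_state.items()
             if k in own_state and own_state[k].shape == v.shape}
vgg.load_state_dict(converted, strict=False)

# vgg.features can now serve as the SSD base network in place of the default one.
```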

FPNSSD512 is created by replacing the SSD VGG16 backbone with FPN50; the rest of the network stays the same. It beats all the SSD models above.
You can download the trained parameters here.
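A minimal sketch of the idea, under stated assumptions rather than the repository's exact implementation: a ResNet-50 trunk feeds a feature pyramid (playing the role of FPN50), and per-level SSD class/box heads replace the ones attached to VGG16's feature maps. The layer grouping, channel widths, and the 9-anchors-per-cell head shape are illustrative assumptions.

```python
from collections import OrderedDict

import torch
import torch.nn as nn
import torchvision
from torchvision.ops import FeaturePyramidNetwork


class FPN50Backbone(nn.Module):
    """ResNet-50 stages c2-c5 followed by a feature pyramid (stands in for FPN50)."""

    def __init__(self, out_channels=256):
        super().__init__()
        resnet = torchvision.models.resnet50()
        self.stem = nn.Sequential(resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool)
        self.layer1, self.layer2 = resnet.layer1, resnet.layer2
        self.layer3, self.layer4 = resnet.layer3, resnet.layer4
        self.fpn = FeaturePyramidNetwork([256, 512, 1024, 2048], out_channels)

    def forward(self, x):
        c2 = self.layer1(self.stem(x))
        c3 = self.layer2(c2)
        c4 = self.layer3(c3)
        c5 = self.layer4(c4)
        feats = OrderedDict([('p2', c2), ('p3', c3), ('p4', c4), ('p5', c5)])
        return list(self.fpn(feats).values())  # multi-scale maps, all out_channels deep


class SSDHeads(nn.Module):
    """One class/box predictor pair per pyramid level (9 anchors per cell assumed)."""

    def __init__(self, num_levels, num_classes=21, num_anchors=9, channels=256):
        super().__init__()
        self.cls = nn.ModuleList(
            nn.Conv2d(channels, num_anchors * num_classes, 3, padding=1)
            for _ in range(num_levels))
        self.box = nn.ModuleList(
            nn.Conv2d(channels, num_anchors * 4, 3, padding=1)
            for _ in range(num_levels))

    def forward(self, feature_maps):
        return ([c(f) for c, f in zip(self.cls, feature_maps)],
                [b(f) for b, f in zip(self.box, feature_maps)])


backbone, heads = FPN50Backbone(), SSDHeads(num_levels=4)
cls_preds, box_preds = heads(backbone(torch.randn(1, 3, 512, 512)))
```

The FPN152 model mentioned in the updates below is the same swap with a ResNet-152 trunk in place of ResNet-50.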

Update

[2018-2-6] Our FPNSSD512 model achieved 1st place on the PASCAL VOC 2012 leaderboard.

Check the leaderboard.

[2018-2-26] As issue #11 pointed out, I shouldn't have used VOC07 data for training. I submitted another result trained only on VOC12 data. The older submission has been marked private.


[2018-3-29] After Alibaba Turing Lab submitted a result of 74.8% mAP, taking first place on comp3, I decided to train a deeper model (replacing FPN50 with FPN152, still trained only on VOC12 data).
It reached 77% mAP, which is far higher than I expected.
Check the new leaderboard. The older submission has been marked private.

TODO

  • SSD300
  • SSD512
  • FPNSSD512
  • RetinaNet
  • Faster R-CNN
  • Mask R-CNN