Reproduces ResNeXt (Aggregated Residual Transformations for Deep Neural Networks) in PyTorch.
- Trains on CIFAR-10 and CIFAR-100
- Upload CIFAR training curves
- Upload CIFAR trained models
- PyTorch 0.4.0
- Train ImageNet
```bash
git clone https://github.com/prlz77/resnext.pytorch
cd resnext.pytorch
git checkout R4.0  # R3.0 for backwards compatibility
```
To train on CIFAR-10 using two GPUs:

```bash
python train.py ~/DATASETS/cifar.python cifar10 -s ./snapshots --log ./logs --ngpu 2 --learning_rate 0.05 -b 128
```

It should reach ~3.65% test error on CIFAR-10 and ~17.77% on CIFAR-100.
After the training phase, you can evaluate the saved model. Thanks to @AppleHolic we now have a test script.

To test on CIFAR-10 using two GPUs:

```bash
python test.py ~/DATASETS/cifar.python cifar10 --ngpu 2 --load ./snapshots/model.pytorch --test_bs 128
```
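
To inspect a snapshot outside of `test.py`, here is a minimal sketch (it assumes the snapshot is a plain `state_dict`; adjust the path to your own run):

```python
import torch

# Load the saved weights on CPU; the path matches the --load argument above.
state = torch.load('./snapshots/model.pytorch', map_location='cpu')

# Print each parameter's name and shape to sanity-check the architecture.
for name, tensor in state.items():
    print(name, tuple(tensor.shape))
```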
From the original paper:
| cardinality | base_width | parameters | error CIFAR-10 | error CIFAR-100 | default |
|---|---|---|---|---|---|
| 8 | 64 | 34.4M | 3.65 | 17.77 | x |
| 16 | 64 | 68.1M | 3.58 | 17.31 | |
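
If you instantiate a model yourself, a small helper like the following (it only assumes a standard `nn.Module`, not any repository-specific API) lets you compare its size against the parameter counts above:

```python
import torch.nn as nn

def count_parameters(net: nn.Module) -> float:
    """Return the number of trainable parameters in millions."""
    return sum(p.numel() for p in net.parameters() if p.requires_grad) / 1e6
```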
Update: `widen_factor` has been disentangled from `base_width` because it was confusing. The widen factor is now fixed to the constant 4, and `base_width` has the same meaning as in the original paper.
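
For reference, here is a minimal sketch of how `cardinality`, `base_width`, and `widen_factor` typically combine in a ResNeXt bottleneck block. Class and attribute names are illustrative and may differ from the code in this repository:

```python
import torch.nn as nn


class ResNeXtBottleneck(nn.Module):
    """Split-transform-merge block: the grouped 3x3 convolution implements
    `cardinality` parallel transformations in a single layer."""

    def __init__(self, in_channels, out_channels, stride,
                 cardinality, base_width, widen_factor):
        super().__init__()
        # Bottleneck width of the grouped conv: grows with base_width and with
        # the stage's output channels, relative to widen_factor * 64.
        width_ratio = out_channels / (widen_factor * 64.0)
        D = cardinality * int(base_width * width_ratio)

        self.conv_reduce = nn.Conv2d(in_channels, D, kernel_size=1, bias=False)
        self.bn_reduce = nn.BatchNorm2d(D)
        self.conv_grouped = nn.Conv2d(D, D, kernel_size=3, stride=stride,
                                      padding=1, groups=cardinality, bias=False)
        self.bn_grouped = nn.BatchNorm2d(D)
        self.conv_expand = nn.Conv2d(D, out_channels, kernel_size=1, bias=False)
        self.bn_expand = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

        # Projection shortcut when the spatial size or channel count changes.
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )

    def forward(self, x):
        out = self.relu(self.bn_reduce(self.conv_reduce(x)))
        out = self.relu(self.bn_grouped(self.conv_grouped(out)))
        out = self.bn_expand(self.conv_expand(out))
        return self.relu(out + self.shortcut(x))
```

Roughly, a CIFAR ResNeXt-29 stacks three stages of three such blocks (256, 512, and 1024 output channels for `widen_factor=4`) between an initial 3x3 convolution and a global-average-pooled linear classifier.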
Link to trained models corresponding to the following curves:
Update: several commits have been pushed since the models in Mega were trained, so it is recommended to check out commit `e10c37d8cf7a958048bc0f58cd86c3e8ac4e707d` when using them.
- Torch (@facebookresearch). Original implementation; CIFAR and ImageNet.
- Caffe (@terrychenism). ImageNet.
- MXNet (@dmlc). ImageNet.
```
@article{xie2016aggregated,
  title={Aggregated residual transformations for deep neural networks},
  author={Xie, Saining and Girshick, Ross and Doll{\'a}r, Piotr and Tu, Zhuowen and He, Kaiming},
  journal={arXiv preprint arXiv:1611.05431},
  year={2016}
}
```