这是一个图像分类相关的代码集合,目标是把自己感兴趣的方法整合到一个统一的框架里,方便以后复现别人的方法以及验证自己的想法。本工程包括以下几个部分:
- ResNet: He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 770-778. 经典残差网络
- VGGNet: Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014.
- MobileNet: Howard A G, Zhu M, Chen B, et al. Mobilenets: Efficient convolutional neural networks for mobile vision applications[J]. arXiv preprint arXiv:1704.04861, 2017.
- SENet: Hu, J., et al. (2019). "Squeeze-and-Excitation Networks." arXiv pre-print server. 通道注意力机制中的经典方法
- CBAM: Woo, S., et al. (2018). "CBAM: Convolutional Block Attention Module." arXiv pre-print server. 通道和空间注意力机制
- Non-local: Wang, X., et al. (2018). "Non-local Neural Networks." arXiv pre-print server. 自注意力机制的经典方法
- Coordinate Attention: Hou, Q., et al. (2021). "Coordinate Attention for Efficient Mobile Network Design." arXiv pre-print server.
- softmax cross-entropy loss
- arcface loss: Deng J, Guo J, Xue N, et al. Arcface: Additive angular margin loss for deep face recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 4690-4699.
- center loss: Wen Y, Zhang K, Li Z, et al. A discriminative feature learning approach for deep face recognition[C]//European conference on computer vision. 2016: 499-515
- cutmix: Yun, S., et al. (2019). "CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features." arXiv pre-print server.
- mixup: Zhang H, Cisse M, Dauphin Y N, et al. mixup: Beyond empirical risk minimization[J]. arXiv preprint arXiv:1710.09412, 2017.
- AutoAugment: Cubuk E D, Zoph B, Mane D, et al. Autoaugment: Learning augmentation policies from data[EB/OL]. [2020-01-01]. http://arxiv.org/abs/1805.09501.
- Zhou B, Cui Q, Wei X S, et al. Bbn: Bilateral-branch network with cumulative learning for long-tailed visual recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 9719-9728.
- Kang B, Xie S, Rohrbach M, et al. Decoupling representation and classifier for long-tailed recognition[J]. arXiv preprint arXiv:1910.09217, 2019.
- Lin T Y, RoyChowdhury A, Maji S. Bilinear cnn models for fine-grained visual recognition[C]//Proceedings of the IEEE international conference on computer vision. 2015: 1449-1457
- WSDAN: Hu T, Qi H, Huang Q, et al. See better before looking closer: Weakly supervised data augmentation network for fine-grained visual classification[J]. arXiv preprint arXiv:1901.09891, 2019
- Zhou B, Khosla A, Lapedriza A, et al. Learning deep features for discriminative localization[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 2921-2929.
- Selvaraju R R, Cogswell M, Das A, et al. Grad-cam: Visual explanations from deep networks via gradient-based localization[C]//Proceedings of the IEEE international conference on computer vision. 2017: 618-626.
Backbone | Attn | Loss | Top1 Acc |
---|---|---|---|
ResNet50 | None | label smothing softmax ce | 87.159% |
ResNet50 | None | softmax ce | 86.020% |