A PyTorch-Based Framework for Deep Learning in Computer Vision

Primary language: Shell. License: Apache License 2.0.

PyTorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision

@misc{CV2018,
  author =       {Donny You},
  howpublished = {\url{https://github.com/CVBox/PyTorchCV}},
  year =         {2018}
}

This repository provides source code for several deep-learning-based computer vision problems. We will do our best to keep it up to date. If you find a problem with the repository, please raise an issue and we will fix it promptly.

Implemented Papers

  • Image Classification

    • VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition
    • ResNet: Deep Residual Learning for Image Recognition
    • DenseNet: Densely Connected Convolutional Networks
    • MobileNetV2: Inverted Residuals and Linear Bottlenecks
    • ResNeXt: Aggregated Residual Transformations for Deep Neural Networks
    • SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
    • ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
    • ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
  • Pose Estimation

    • CPM: Convolutional Pose Machines
    • OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
  • Object Detection

    • SSD: Single Shot MultiBox Detector
    • Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
    • YOLOv3: An Incremental Improvement
    • FPN: Feature Pyramid Networks for Object Detection
  • Semantic Segmentation

    • DeepLabV3: Rethinking Atrous Convolution for Semantic Image Segmentation
    • PSPNet: Pyramid Scene Parsing Network
    • DenseASPP: DenseASPP for Semantic Segmentation in Street Scenes
  • Instance Segmentation

    • Mask R-CNN

Performances with PyTorchCV

Object Detection

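  • SSD: Single Shot MultiBox Detector

| Model         | Training data     | Testing data | mAP   | FPS |
|---------------|-------------------|--------------|-------|-----|
| SSD-300 Origin | VOC07+12 trainval | VOC07 test   | 0.772 | -   |
| SSD-300 Ours   | VOC07+12 trainval | VOC07 test   | 0.786 | -   |
| SSD-512 Origin | VOC07+12 trainval | VOC07 test   | 0.798 | -   |
| SSD-512 Ours   | VOC07+12 trainval | VOC07 test   | 0.807 | -   |

  • Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

| Model               | Training data  | Testing data | mAP   | FPS |
|---------------------|----------------|--------------|-------|-----|
| Faster R-CNN Origin | VOC07 trainval | VOC07 test   | 0.699 | -   |
| Faster R-CNN Ours   | VOC07 trainval | VOC07 test   | 0.706 | -   |
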
  • YOLOv3: An Incremental Improvement
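
The mAP figures reported above for SSD and Faster R-CNN can be compared directly. The short sketch below only restates those reported numbers and computes the gain of each "Ours" entry over its "Origin" counterpart; the dictionary layout and the `map_gain` helper are illustrative names, not part of PyTorchCV.

```python
# (origin mAP, ours mAP) on VOC07 test, copied from the results listed above.
results = {
    "SSD-300": (0.772, 0.786),
    "SSD-512": (0.798, 0.807),
    "Faster R-CNN": (0.699, 0.706),
}


def map_gain(origin, ours):
    """Absolute mAP gain, rounded to the three decimals the results report."""
    return round(ours - origin, 3)


gains = {model: map_gain(origin, ours) for model, (origin, ours) in results.items()}

for model, gain in gains.items():
    print(f"{model}: {gain:+.3f}")
# SSD-300: +0.014
# SSD-512: +0.009
# Faster R-CNN: +0.007
```

These are absolute differences in mAP, so +0.014 corresponds to 1.4 points on the usual 0 to 100 scale.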

Commands with PyTorchCV

Take OpenPose as an example.

  • Train the OpenPose model:
python main.py  --hypes hypes/pose/coco/op_coco_pose.json \
                --base_lr 0.001 \
                --phase train \
                --gpu 0 1
  • Fine-tune the OpenPose model from a checkpoint:
python main.py  --hypes hypes/pose/coco/op_coco_pose.json \
                --base_lr 0.001 \
                --phase train \
                --resume checkpoints/pose/coco/coco_open_pose_65000.pth \
                --gpu 0 1
  • Test the OpenPose model on a single image (test_img):
python main.py  --hypes hypes/pose/coco/op_coco_pose.json \
                --phase test \
                --resume checkpoints/pose/coco/coco_open_pose_65000.pth \
                --test_img val/samples/ski.jpg \
                --gpu 0
  • Test the OpenPose model on a directory of images (test_dir):
python main.py  --hypes hypes/pose/coco/op_coco_pose.json \
                --phase test \
                --resume checkpoints/pose/coco/coco_open_pose_65000.pth \
                --test_dir val/samples \
                --gpu 0
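
To make the flags used in the commands above easier to read, here is a minimal sketch of a parser that accepts the same arguments. It assumes an `argparse`-based entry point; the actual `main.py` may define these options differently and may accept more of them. The defaults, the restriction of `--phase` to the two values shown above, and the `check_args` helper are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags seen in the commands above."""
    parser = argparse.ArgumentParser(description="PyTorchCV-style launcher (sketch)")
    parser.add_argument("--hypes", required=True,
                        help="path to the JSON configuration file")
    # Only the two phases shown in the commands are listed here (assumption).
    parser.add_argument("--phase", default="train", choices=["train", "test"])
    parser.add_argument("--base_lr", type=float, default=None,
                        help="base learning rate given on the command line")
    parser.add_argument("--resume", default=None,
                        help="checkpoint to fine-tune or test from")
    parser.add_argument("--test_img", default=None, help="single image to test")
    parser.add_argument("--test_dir", default=None, help="directory of images to test")
    # "--gpu 0 1" becomes the list [0, 1].
    parser.add_argument("--gpu", type=int, nargs="+", default=[0])
    return parser


def check_args(args):
    """Assumed sanity checks: testing needs a checkpoint and exactly one input."""
    if args.phase == "test":
        if args.resume is None:
            raise ValueError("--phase test requires --resume")
        if (args.test_img is None) == (args.test_dir is None):
            raise ValueError("give exactly one of --test_img or --test_dir")
    return args


# Parse the training command from above without touching sys.argv.
example = check_args(build_parser().parse_args([
    "--hypes", "hypes/pose/coco/op_coco_pose.json",
    "--base_lr", "0.001",
    "--phase", "train",
    "--gpu", "0", "1",
]))
print(example.gpu)  # [0, 1]
```

Passing an explicit argument list to `parse_args` keeps the sketch runnable in a notebook or test, where `sys.argv` belongs to another program.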

Examples with PyTorchCV

Example output of VGG19-OpenPose