/CV-pretrained-model

A collection of computer vision pre-trained models.

Computer Vision Pretrained Models

What is a pre-trained model?

A pre-trained model is a model created by someone else to solve a similar problem. Instead of building a model from scratch, we can use a model trained on another problem as a starting point. A pre-trained model may not be 100% accurate in your application.

For example, suppose you want to build a self-driving car. You can spend years building a decent image recognition algorithm from scratch, or you can take the Inception model (a pre-trained model) from Google, which was trained on ImageNet data, to identify the objects in your images.
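The "starting point" idea can be sketched in a few lines of plain Python. This is only a toy illustration, not how any of the listed models is implemented: the `PRETRAINED_WEIGHTS` values are made up and stand in for a backbone that someone else trained, and `extract_features`, `train_head`, and `predict` are hypothetical helper names. The frozen weights are reused as-is, and only a small new "head" is trained for the new task.

```python
import math
import random

# Hypothetical frozen "backbone" weights, standing in for a model that
# someone else already trained (for example on ImageNet).
PRETRAINED_WEIGHTS = [[1.0, -1.0], [-1.0, 1.0]]


def extract_features(x):
    """Frozen backbone: one linear layer followed by ReLU. Never updated."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)))
            for row in PRETRAINED_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=300, lr=0.5, seed=0):
    """Fit a logistic-regression head on top of the frozen features."""
    rng = random.Random(seed)
    head = [rng.uniform(-0.1, 0.1) for _ in PRETRAINED_WEIGHTS]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            feats = extract_features(x)
            pred = sigmoid(sum(h * f for h, f in zip(head, feats)) + bias)
            error = pred - y  # gradient of the log loss w.r.t. the logit
            # Only the head and its bias are updated; the backbone stays fixed.
            head = [h - lr * error * f for h, f in zip(head, feats)]
            bias -= lr * error
    return head, bias


def predict(x, head, bias):
    """Probability that x belongs to class 1 under the new task."""
    feats = extract_features(x)
    return sigmoid(sum(h * f for h, f in zip(head, feats)) + bias)


# New task: the label is 1 when the first input is larger than the second.
samples = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
labels = [1, 1, 0, 0]
frozen_before = [row[:] for row in PRETRAINED_WEIGHTS]
head, bias = train_head(samples, labels)
```

In practice the frozen part would be a real network loaded from one of the repositories listed below, and the head would be trained with that framework's own optimizer; the structure (reuse the backbone, train a small task-specific layer) stays the same.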

Framework

The models below are grouped by framework: Tensorflow, Keras, PyTorch, Caffe, and MXNet.

Model visualization

You can see visualizations of each model's network architecture by using Netron.
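As a minimal sketch, a model file can be opened in Netron from Python. This assumes the `netron` package is installed (for example with `pip install netron`) and that its `start` function accepts a path to a model file. The helper name `open_in_netron` and the file name `model.onnx` are hypothetical.

```python
import importlib.util
from pathlib import Path


def open_in_netron(model_path):
    """Open a saved model file in the Netron viewer (hypothetical helper)."""
    path = Path(model_path)
    if not path.is_file():
        raise FileNotFoundError(f"model file not found: {path}")
    if importlib.util.find_spec("netron") is None:
        raise RuntimeError("netron is not installed; try `pip install netron`")
    import netron  # imported lazily so the check above can report a clear error
    netron.start(str(path))


# Example call (hypothetical file name):
# open_in_netron("model.onnx")
```

The helper checks the path first so that a mistyped file name fails with a clear error before the viewer is launched.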

Tensorflow

| Model Name | Description | Framework | License |
| --- | --- | --- | --- |
| ObjectDetection | Localizing and identifying multiple objects in a single image. | Tensorflow | Apache License |
| Mask R-CNN | The model generates bounding boxes and segmentation masks for each instance of an object in the image. It's based on Feature Pyramid Network (FPN) and a ResNet101 backbone. | Tensorflow | The MIT License (MIT) |
| Faster-RCNN | This is an experimental Tensorflow implementation of Faster RCNN - a convnet for object detection with a region proposal network. | Tensorflow | MIT License |
| YOLO TensorFlow | This is a TensorFlow implementation of YOLO: Real-Time Object Detection. | Tensorflow | Custom |
| YOLO TensorFlow ++ | TensorFlow implementation of 'YOLO: Real-Time Object Detection', with training and actual support for real-time running on mobile devices. | Tensorflow | GNU GENERAL PUBLIC LICENSE |
| MobileNet | MobileNets trade off between latency, size and accuracy while comparing favorably with popular models from the literature. | Tensorflow | The MIT License (MIT) |
| DeepLab | Deep labeling for semantic image segmentation. | Tensorflow | Apache License |
| Colornet | Neural Network to colorize grayscale images. | Tensorflow | Not Found |
| SRGAN | Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. | Tensorflow | Not Found |
| DeepOSM | Train TensorFlow neural nets with OpenStreetMap features and satellite imagery. | Tensorflow | The MIT License (MIT) |
| Domain Transfer Network | Implementation of Unsupervised Cross-Domain Image Generation. | Tensorflow | MIT License |
| Show, Attend and Tell | Attention Based Image Caption Generator. | Tensorflow | MIT License |
| android-yolo | Real-time object detection on Android using the YOLO network, powered by TensorFlow. | Tensorflow | Apache License |
| DCSCN Super Resolution | This is a TensorFlow implementation of "Fast and Accurate Image Super Resolution by Deep CNN with Skip Connection and Network in Network", a deep learning based Single-Image Super-Resolution (SISR) model. | Tensorflow | Not Found |
| GAN-CLS | This is an experimental TensorFlow implementation of synthesizing images. | Tensorflow | Not Found |
| U-Net | For Brain Tumor Segmentation. | Tensorflow | Not Found |
| Improved CycleGAN | Unpaired Image to Image Translation. | Tensorflow | MIT License |
| Im2txt | Image-to-text neural network for image captioning. | Tensorflow | Apache License |
| SLIM | Image classification models in TF-Slim. | Tensorflow | Apache License |
| DELF | Deep local features for image matching and retrieval. | Tensorflow | Apache License |
| Compression | Compressing and decompressing images using a pre-trained Residual GRU network. | Tensorflow | Apache License |
| AttentionOCR | A model for real-world image text extraction. | Tensorflow | Apache License |

Keras

| Model Name | Description | Framework | License |
| --- | --- | --- | --- |
| Mask R-CNN | The model generates bounding boxes and segmentation masks for each instance of an object in the image. It's based on Feature Pyramid Network (FPN) and a ResNet101 backbone. | Keras | The MIT License (MIT) |
| VGG16 | Very Deep Convolutional Networks for Large-Scale Image Recognition. | Keras | The MIT License (MIT) |
| VGG19 | Very Deep Convolutional Networks for Large-Scale Image Recognition. | Keras | The MIT License (MIT) |
| ResNet | Deep Residual Learning for Image Recognition. | Keras | The MIT License (MIT) |
| ResNet50 | Deep Residual Learning for Image Recognition. | Keras | The MIT License (MIT) |
| Nasnet | NASNet refers to Neural Architecture Search Network, a family of models that were designed automatically by learning the model architectures directly on the dataset of interest. | Keras | The MIT License (MIT) |
| MobileNet | MobileNet v1 models for Keras. | Keras | The MIT License (MIT) |
| MobileNet V2 | MobileNet v2 models for Keras. | Keras | The MIT License (MIT) |
| MobileNet V3 | MobileNet v3 models for Keras. | Keras | The MIT License (MIT) |
| efficientnet | Rethinking Model Scaling for Convolutional Neural Networks. | Keras | The MIT License (MIT) |
| Image analogies | Generate image analogies using neural matching and blending. | Keras | The MIT License (MIT) |
| Popular Image Segmentation Models | Implementation of Segnet, FCN, UNet and other models in Keras. | Keras | MIT License |
| Ultrasound nerve segmentation | This tutorial shows how to use the Keras library to build a deep neural network for ultrasound image nerve segmentation. | Keras | MIT License |
| DeepMask object segmentation | This is a Keras-based Python implementation of DeepMask, a complex deep neural network for learning object segmentation masks. | Keras | Not Found |
| Monolingual and Multilingual Image Captioning | This is the source code that accompanies Multilingual Image Description with Neural Sequence Models. | Keras | BSD-3-Clause License |
| pix2pix | Keras implementation of Image-to-Image Translation with Conditional Adversarial Networks by Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros. | Keras | Not Found |
| Colorful Image colorization | B&W to color. | Keras | Not Found |
| CycleGAN | Implementation of Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. | Keras | MIT License |
| DualGAN | Implementation of DualGAN: Unsupervised Dual Learning for Image-to-Image Translation. | Keras | MIT License |
| Super-Resolution GAN | Implementation of Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. | Keras | MIT License |

PyTorch

| Model Name | Description | Framework | License |
| --- | --- | --- | --- |
| detectron2 | Detectron2 is Facebook AI Research's next generation software system that implements state-of-the-art object detection algorithms. | PyTorch | Apache License 2.0 |
| FastPhotoStyle | A Closed-form Solution to Photorealistic Image Stylization. | PyTorch | Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License |
| pytorch-CycleGAN-and-pix2pix | Image-to-image translation in PyTorch (CycleGAN and pix2pix). | PyTorch | BSD License |
| maskrcnn-benchmark | Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch. | PyTorch | MIT License |
| deep-image-prior | Image restoration with neural networks but without learning. | PyTorch | Apache License 2.0 |
| StarGAN | StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation. | PyTorch | MIT License |
| faster-rcnn.pytorch | This project is a faster Faster R-CNN implementation, aimed at accelerating the training of Faster R-CNN object detection models. | PyTorch | MIT License |
| pix2pixHD | Synthesizing and manipulating 2048x1024 images with conditional GANs. | PyTorch | BSD License |
| Augmentor | Image augmentation library in Python for machine learning. | PyTorch | MIT License |
| albumentations | Fast image augmentation library. | PyTorch | MIT License |
| Deep Video Analytics | Deep Video Analytics is a platform for indexing and extracting information from videos and images. | PyTorch | Custom |
| semantic-segmentation-pytorch | PyTorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset. | PyTorch | BSD 3-Clause License |
| An End-to-End Trainable Neural Network for Image-based Sequence Recognition | This software implements the Convolutional Recurrent Neural Network (CRNN), a combination of CNN, RNN and CTC loss for image-based sequence recognition tasks, such as scene text recognition and OCR. | PyTorch | The MIT License (MIT) |
| UNIT | PyTorch implementation of the Coupled VAE-GAN algorithm for Unsupervised Image-to-Image Translation. | PyTorch | Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License |
| Neural Sequence labeling model | Sequence labeling models are quite popular in many NLP tasks, such as Named Entity Recognition (NER), part-of-speech (POS) tagging and word segmentation. | PyTorch | Apache License |
| faster rcnn | This is a PyTorch implementation of Faster RCNN. This project is mainly based on py-faster-rcnn and TFFRCNN. For details about R-CNN please refer to the paper Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks by Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun. | PyTorch | MIT License |
| pytorch-semantic-segmentation | PyTorch for Semantic Segmentation. | PyTorch | MIT License |
| EDSR-PyTorch | PyTorch version of the paper 'Enhanced Deep Residual Networks for Single Image Super-Resolution'. | PyTorch | MIT License |
| image-classification-mobile | Collection of classification models pretrained on the ImageNet-1K. | PyTorch | MIT License |
| FaderNetworks | Fader Networks: Manipulating Images by Sliding Attributes - NIPS 2017. | PyTorch | Creative Commons Attribution-NonCommercial 4.0 International Public License |
| neuraltalk2-pytorch | Image captioning model in PyTorch (finetunable cnn in branch with_finetune). | PyTorch | MIT License |
| RandWireNN | Implementation of: "Exploring Randomly Wired Neural Networks for Image Recognition". | PyTorch | Not Found |
| stackGAN-v2 | PyTorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++. | PyTorch | MIT License |
| Detectron models for Object Detection | This code allows you to use some of the Detectron models for object detection from Facebook AI Research with PyTorch. | PyTorch | Apache License |
| DEXTR-PyTorch | This paper explores the use of extreme points in an object (left-most, right-most, top, bottom pixels) as input to obtain precise object segmentation for images and videos. | PyTorch | GNU GENERAL PUBLIC LICENSE |
| pointnet.pytorch | PyTorch implementation for "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation". | PyTorch | MIT License |
| self-critical.pytorch | This repository includes an unofficial implementation of Self-critical Sequence Training for Image Captioning and Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering. | PyTorch | MIT License |
| vnet.pytorch | A PyTorch implementation for V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation. | PyTorch | BSD 3-Clause License |
| piwise | Pixel-wise segmentation on VOC2012 dataset using PyTorch. | PyTorch | BSD 3-Clause License |
| pspnet-pytorch | PyTorch implementation of PSPNet segmentation network. | PyTorch | Not Found |
| pytorch-SRResNet | PyTorch implementation for Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. | PyTorch | The MIT License (MIT) |
| PNASNet.pytorch | PyTorch implementation of PNASNet-5 on ImageNet. | PyTorch | Apache License |
| img_classification_pk_pytorch | Quickly comparing your image classification models with the state-of-the-art models. | PyTorch | Not Found |
| Deep Neural Networks are Easily Fooled | High Confidence Predictions for Unrecognizable Images. | PyTorch | MIT License |
| pix2pix-pytorch | PyTorch implementation of "Image-to-Image Translation Using Conditional Adversarial Networks". | PyTorch | Not Found |
| NVIDIA/semantic-segmentation | A PyTorch implementation of Improving Semantic Segmentation via Video Propagation and Label Relaxation, in CVPR 2019. | PyTorch | CC BY-NC-SA 4.0 license |
| Neural-IMage-Assessment | A PyTorch implementation of Neural IMage Assessment. | PyTorch | Not Found |
| torchxrayvision | Pretrained models for chest X-ray (CXR) pathology predictions. Medical, Healthcare, Radiology. | PyTorch | Apache License |
| pytorch-image-models | PyTorch image models, scripts, pretrained weights -- (SE)ResNet/ResNeXT, DPN, EfficientNet, MixNet, MobileNet-V3/V2, MNASNet, Single-Path NAS, FBNet, and more. | PyTorch | Apache License 2.0 |

Caffe

| Model Name | Description | Framework | License |
| --- | --- | --- | --- |
| OpenPose | OpenPose represents the first real-time multi-person system to jointly detect human body, hand, and facial keypoints (in total 130 keypoints) on single images. | Caffe | Custom |
| Fully Convolutional Networks for Semantic Segmentation | Fully Convolutional Models for Semantic Segmentation. | Caffe | Not Found |
| Colorful Image Colorization | Colorful Image Colorization. | Caffe | BSD-2-Clause License |
| R-FCN | R-FCN: Object Detection via Region-based Fully Convolutional Networks. | Caffe | MIT License |
| cnn-vis | Inspired by Google's recent Inceptionism blog post, cnn-vis is an open-source tool that lets you use convolutional neural networks to generate images. | Caffe | The MIT License (MIT) |
| DeconvNet | Learning Deconvolution Network for Semantic Segmentation. | Caffe | Custom |

MXNet

| Model Name | Description | Framework | License |
| --- | --- | --- | --- |
| Faster RCNN | Region Proposal Network solves object detection as a regression problem. | MXNet | Apache License, Version 2.0 |
| SSD | SSD is a unified framework for object detection with a single network. | MXNet | MIT License |
| Faster RCNN+Focal Loss | An unofficial implementation of Focal Loss for Dense Object Detection. | MXNet | Not Found |
| CNN-LSTM-CTC | Three different models for text recognition, all of which use a CTC loss layer so that text images do not need to be segmented. | MXNet | Not Found |
| Faster_RCNN_for_DOTA | This is the official repo of the paper DOTA: A Large-scale Dataset for Object Detection in Aerial Images. | MXNet | Apache License |
| RetinaNet | Focal loss for Dense Object Detection. | MXNet | Not Found |
| MobileNetV2 | This is an MXNet implementation of the MobileNetV2 architecture as described in the paper Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation. | MXNet | Apache License |
| neuron-selectivity-transfer | This code is a re-implementation of the ImageNet classification experiments in the paper Like What You Like: Knowledge Distill via Neuron Selectivity Transfer. | MXNet | Apache License |
| MobileNetV2 | This is a Gluon implementation of the MobileNetV2 architecture as described in the paper Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation. | MXNet | Apache License |
| sparse-structure-selection | This code is a re-implementation of the ImageNet classification experiments in the paper Data-Driven Sparse Structure Selection for Deep Neural Networks. | MXNet | Apache License |
| FastPhotoStyle | A Closed-form Solution to Photorealistic Image Stylization. | MXNet | Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License |

Contributions

Your contributions are always welcome! Please have a look at contributing.md.

License

MIT License