Computer Vision Pretrained Models

What is a pre-trained model?

A pre-trained model is a model that someone else has already trained to solve a similar problem. Instead of building a model from scratch, you can use a model trained on another problem as a starting point. A pre-trained model may not be 100% accurate in your application, so evaluate it on your own data before relying on it.

For example, suppose you want to build a self-driving car. You can spend years building a decent image recognition algorithm from scratch, or you can take the Inception model (a pre-trained model) from Google, which was trained on ImageNet data, to identify objects in images.
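As a minimal sketch of the second option, the snippet below loads an ImageNet-trained Inception model through the Keras applications API bundled with TensorFlow and returns its top guesses for one image. The choice of `InceptionV3`, the helper name `classify_image`, and the file name `street.jpg` are assumptions made for illustration; the weights are downloaded the first time the model is created.

```python
INPUT_SIZE = (299, 299)  # default input resolution of Keras' InceptionV3


def classify_image(image_path, top=3):
    """Return the `top` ImageNet (label, score) guesses for one image file."""
    # Imported lazily so this module can be imported without TensorFlow.
    import numpy as np
    import tensorflow as tf

    inception = tf.keras.applications.inception_v3

    # Downloads the ImageNet weights on first use and caches them locally.
    model = inception.InceptionV3(weights="imagenet")

    # Resize to the expected resolution and add a batch dimension.
    image = tf.keras.utils.load_img(image_path, target_size=INPUT_SIZE)
    batch = np.expand_dims(tf.keras.utils.img_to_array(image), axis=0)

    # Scale pixels the same way the network saw them during training.
    batch = inception.preprocess_input(batch)

    predictions = model.predict(batch)
    decoded = inception.decode_predictions(predictions, top=top)[0]
    return [(label, float(score)) for _, label, score in decoded]


# Example ("street.jpg" is a hypothetical file; use any local image):
# for label, score in classify_image("street.jpg"):
#     print(f"{label}: {score:.3f}")
```

The labels come from the 1,000 ImageNet classes, so a real driving application would still need a model adapted to its own categories.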

Other Pre-trained Models

Model Deployment library

Model Serving

Model visualization

You can see visualizations of each model's network architecture by using Netron.
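Netron is also distributed as a Python package, so a saved model file can be opened from a script. The sketch below assumes `pip install netron` and a hypothetical file `model.onnx`; `open_in_netron` is an illustrative helper, not part of Netron itself.

```python
from pathlib import Path


def open_in_netron(model_path):
    """Serve `model_path` in Netron's local viewer and open it in a browser."""
    path = Path(model_path)
    if not path.is_file():
        raise FileNotFoundError(f"No model file at {path}")

    # Imported lazily: Netron is a third-party package (pip install netron).
    import netron

    # Starts a local web server for the file and opens the default browser.
    netron.start(str(path))


# Example (hypothetical file name):
# open_in_netron("model.onnx")
```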

Tensorflow

Model Name	Description	Framework	License
ObjectDetection	Localizing and identifying multiple objects in a single image.	`Tensorflow`	Apache License
Mask R-CNN	The model generates bounding boxes and segmentation masks for each instance of an object in the image. It's based on Feature Pyramid Network (FPN) and a ResNet101 backbone.	`Tensorflow`	The MIT License (MIT)
Faster-RCNN	This is an experimental Tensorflow implementation of Faster RCNN - a convnet for object detection with a region proposal network.	`Tensorflow`	MIT License
YOLO TensorFlow	A TensorFlow implementation of YOLO: Real-Time Object Detection.	`Tensorflow`	Custom
YOLO TensorFlow ++	TensorFlow implementation of 'YOLO: Real-Time Object Detection', with training and actual support for real-time running on mobile devices.	`Tensorflow`	GNU GENERAL PUBLIC LICENSE
MobileNet	MobileNets trade off between latency, size and accuracy while comparing favorably with popular models from the literature.	`Tensorflow`	The MIT License (MIT)
DeepLab	Deep labeling for semantic image segmentation.	`Tensorflow`	Apache License
Colornet	Neural Network to colorize grayscale images.	`Tensorflow`	Not Found
SRGAN	Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network.	`Tensorflow`	Not Found
DeepOSM	Train TensorFlow neural nets with OpenStreetMap features and satellite imagery.	`Tensorflow`	The MIT License (MIT)
Domain Transfer Network	Implementation of Unsupervised Cross-Domain Image Generation.	`Tensorflow`	MIT License
Show, Attend and Tell	Attention Based Image Caption Generator.	`Tensorflow`	MIT License
android-yolo	Real-time object detection on Android using the YOLO network, powered by TensorFlow.	`Tensorflow`	Apache License
DCSCN Super Resolution	This is a tensorflow implementation of "Fast and Accurate Image Super Resolution by Deep CNN with Skip Connection and Network in Network", a deep learning based Single-Image Super-Resolution (SISR) model.	`Tensorflow`	Not Found
GAN-CLS	This is an experimental tensorflow implementation of synthesizing images.	`Tensorflow`	Not Found
U-Net	For Brain Tumor Segmentation.	`Tensorflow`	Not Found
Improved CycleGAN	Unpaired Image to Image Translation.	`Tensorflow`	MIT License
Im2txt	Image-to-text neural network for image captioning.	`Tensorflow`	Apache License
SLIM	Image classification models in TF-Slim.	`Tensorflow`	Apache License
DELF	Deep local features for image matching and retrieval.	`Tensorflow`	Apache License
Compression	Compressing and decompressing images using a pre-trained Residual GRU network.	`Tensorflow`	Apache License
AttentionOCR	A model for real-world image text extraction.	`Tensorflow`	Apache License


Keras

Model Name	Description	Framework	License
Mask R-CNN	The model generates bounding boxes and segmentation masks for each instance of an object in the image. It's based on Feature Pyramid Network (FPN) and a ResNet101 backbone.	`Keras`	The MIT License (MIT)
VGG16	Very Deep Convolutional Networks for Large-Scale Image Recognition.	`Keras`	The MIT License (MIT)
VGG19	Very Deep Convolutional Networks for Large-Scale Image Recognition.	`Keras`	The MIT License (MIT)
ResNet	Deep Residual Learning for Image Recognition.	`Keras`	The MIT License (MIT)
ResNet50	Deep Residual Learning for Image Recognition.	`Keras`	The MIT License (MIT)
Nasnet	NASNet refers to Neural Architecture Search Network, a family of models that were designed automatically by learning the model architectures directly on the dataset of interest.	`Keras`	The MIT License (MIT)
MobileNet	MobileNet v1 models for Keras.	`Keras`	The MIT License (MIT)
MobileNet V2	MobileNet v2 models for Keras.	`Keras`	The MIT License (MIT)
MobileNet V3	MobileNet v3 models for Keras.	`Keras`	The MIT License (MIT)
Image analogies	Generate image analogies using neural matching and blending.	`Keras`	The MIT License (MIT)
Popular Image Segmentation Models	Implementation of Segnet, FCN, UNet and other models in Keras.	`Keras`	MIT License
Ultrasound nerve segmentation	This tutorial shows how to use the Keras library to build a deep neural network for ultrasound image nerve segmentation.	`Keras`	MIT License
DeepMask object segmentation	This is a Keras-based Python implementation of DeepMask, a complex deep neural network for learning object segmentation masks.	`Keras`	Not Found
Monolingual and Multilingual Image Captioning	This is the source code that accompanies Multilingual Image Description with Neural Sequence Models.	`Keras`	BSD-3-Clause License
pix2pix	Keras implementation of Image-to-Image Translation with Conditional Adversarial Networks by Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros.	`Keras`	Not Found
Colorful Image colorization	B&W to color.	`Keras`	Not Found
CycleGAN	Implementation of Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks.	`Keras`	MIT License
DualGAN	Implementation of DualGAN: Unsupervised Dual Learning for Image-to-Image Translation.	`Keras`	MIT License
Super-Resolution GAN	Implementation of Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network.	`Keras`	MIT License
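
Several architectures in this table, such as VGG16, ResNet50 and MobileNet V2, also ship with ImageNet weights in `tf.keras.applications`, which may differ from the repositories listed above. As a hedged sketch of using one of them as a starting point, the function below freezes a MobileNetV2 base and adds a new classification head. The name `build_transfer_model`, the input shape, and the five-class example are assumptions for illustration.

```python
IMAGE_SHAPE = (224, 224, 3)  # a resolution MobileNetV2 has ImageNet weights for


def build_transfer_model(num_classes, weights="imagenet"):
    """Frozen MobileNetV2 feature extractor plus a new softmax head."""
    import tensorflow as tf  # lazy import: TensorFlow is a heavy dependency

    # include_top=False drops the 1000-class ImageNet classifier;
    # pooling="avg" turns the last feature map into one vector per image.
    base = tf.keras.applications.MobileNetV2(
        input_shape=IMAGE_SHAPE,
        include_top=False,
        weights=weights,
        pooling="avg",
    )
    base.trainable = False  # keep the pretrained features fixed

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Example: a five-class problem trained on your own data (hypothetical arrays).
# model = build_transfer_model(num_classes=5)
# model.fit(train_images, train_labels, epochs=3)
```

Only the new `Dense` layer is trained here. Input images should first pass through `tf.keras.applications.mobilenet_v2.preprocess_input` so that pixel values match what the base network expects.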


PyTorch

Model Name	Description	Framework	License
detectron2	Detectron2 is Facebook AI Research's next-generation software system that implements state-of-the-art object detection algorithms.	`PyTorch`	Apache License 2.0
FastPhotoStyle	A Closed-form Solution to Photorealistic Image Stylization.	`PyTorch`	Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License
pytorch-CycleGAN-and-pix2pix	Image-to-image translation in PyTorch (CycleGAN and pix2pix).	`PyTorch`	BSD License
maskrcnn-benchmark	Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.	`PyTorch`	MIT License
deep-image-prior	Image restoration with neural networks but without learning.	`PyTorch`	Apache License 2.0
StarGAN	StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation.	`PyTorch`	MIT License
faster-rcnn.pytorch	A faster PyTorch implementation of Faster R-CNN, aimed at accelerating the training of Faster R-CNN object detection models.	`PyTorch`	MIT License
pix2pixHD	Synthesizing and manipulating 2048x1024 images with conditional GANs.	`PyTorch`	BSD License
Augmentor	Image augmentation library in Python for machine learning.	`PyTorch`	MIT License
albumentations	Fast image augmentation library.	`PyTorch`	MIT License
Deep Video Analytics	Deep Video Analytics is a platform for indexing and extracting information from videos and images.	`PyTorch`	Custom
semantic-segmentation-pytorch	Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset.	`PyTorch`	BSD 3-Clause License
An End-to-End Trainable Neural Network for Image-based Sequence Recognition	This software implements the Convolutional Recurrent Neural Network (CRNN), a combination of CNN, RNN and CTC loss for image-based sequence recognition tasks, such as scene text recognition and OCR.	`PyTorch`	The MIT License (MIT)
UNIT	PyTorch Implementation of our Coupled VAE-GAN algorithm for Unsupervised Image-to-Image Translation.	`PyTorch`	Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License
Neural Sequence labeling model	Sequence labeling models are quite popular in many NLP tasks, such as Named Entity Recognition (NER), part-of-speech (POS) tagging and word segmentation.	`PyTorch`	Apache License
faster rcnn	This is a PyTorch implementation of Faster RCNN. This project is mainly based on py-faster-rcnn and TFFRCNN. For details about R-CNN please refer to the paper Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks by Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun.	`PyTorch`	MIT License
pytorch-semantic-segmentation	PyTorch for Semantic Segmentation.	`PyTorch`	MIT License
EDSR-PyTorch	PyTorch version of the paper 'Enhanced Deep Residual Networks for Single Image Super-Resolution'.	`PyTorch`	MIT License
image-classification-mobile	Collection of classification models pretrained on the ImageNet-1K.	`PyTorch`	MIT License
FaderNetworks	Fader Networks: Manipulating Images by Sliding Attributes - NIPS 2017.	`PyTorch`	Creative Commons Attribution-NonCommercial 4.0 International Public License
neuraltalk2-pytorch	Image captioning model in pytorch (finetunable cnn in branch with_finetune).	`PyTorch`	MIT License
RandWireNN	Implementation of: "Exploring Randomly Wired Neural Networks for Image Recognition".	`PyTorch`	Not Found
stackGAN-v2	Pytorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++.	`PyTorch`	MIT License
Detectron models for Object Detection	This code lets you use some of the Detectron object detection models from Facebook AI Research with PyTorch.	`PyTorch`	Apache License
DEXTR-PyTorch	This paper explores the use of extreme points in an object (left-most, right-most, top, bottom pixels) as input to obtain precise object segmentation for images and videos.	`PyTorch`	GNU GENERAL PUBLIC LICENSE
pointnet.pytorch	PyTorch implementation of "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation".	`PyTorch`	MIT License
self-critical.pytorch	This repository includes an unofficial implementation of Self-critical Sequence Training for Image Captioning and of Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering.	`PyTorch`	MIT License
vnet.pytorch	A Pytorch implementation for V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation.	`PyTorch`	BSD 3-Clause License
piwise	Pixel-wise segmentation on VOC2012 dataset using pytorch.	`PyTorch`	BSD 3-Clause License
pspnet-pytorch	PyTorch implementation of PSPNet segmentation network.	`PyTorch`	Not Found
pytorch-SRResNet	Pytorch implementation for Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network.	`PyTorch`	The MIT License (MIT)
PNASNet.pytorch	PyTorch implementation of PNASNet-5 on ImageNet.	`PyTorch`	Apache License
img_classification_pk_pytorch	Quickly comparing your image classification models with the state-of-the-art models.	`PyTorch`	Not Found
Deep Neural Networks are Easily Fooled	High Confidence Predictions for Unrecognizable Images.	`PyTorch`	MIT License
pix2pix-pytorch	PyTorch implementation of "Image-to-Image Translation Using Conditional Adversarial Networks".	`PyTorch`	Not Found
NVIDIA/semantic-segmentation	A PyTorch Implementation of Improving Semantic Segmentation via Video Propagation and Label Relaxation, In CVPR2019.	`PyTorch`	CC BY-NC-SA 4.0 license
Neural-IMage-Assessment	A PyTorch Implementation of Neural IMage Assessment.	`PyTorch`	Not Found
torchxrayvision	Pretrained models for chest X-ray (CXR) pathology predictions. Medical, Healthcare, Radiology	`PyTorch`	Apache License
pytorch-image-models	PyTorch image models, scripts, pretrained weights -- (SE)ResNet/ResNeXT, DPN, EfficientNet, MixNet, MobileNet-V3/V2, MNASNet, Single-Path NAS, FBNet, and more	`PyTorch`	Apache License 2.0
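
The pytorch-image-models project is published on PyPI as `timm`. As a minimal sketch, assuming `pip install timm torch`, the helper below loads a pretrained backbone and optionally swaps in a new classification head. The helper name `load_timm_model` and the `resnet50` default are illustrative choices.

```python
def load_timm_model(name="resnet50", num_classes=None):
    """Load ImageNet-pretrained weights for architecture `name` via timm."""
    import timm  # lazy import: third-party package (pip install timm)

    kwargs = {"pretrained": True}
    if num_classes is not None:
        # Asks timm for a freshly initialised classifier of this size
        # on top of the pretrained backbone.
        kwargs["num_classes"] = num_classes

    model = timm.create_model(name, **kwargs)
    model.eval()  # inference mode: disables dropout, fixes batch-norm stats
    return model


# Example: a ResNet-50 backbone with a new 10-class head.
# model = load_timm_model("resnet50", num_classes=10)
```

Call `model.train()` again before fine-tuning, since the helper returns the model in inference mode.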


Caffe

Model Name	Description	Framework	License
OpenPose	OpenPose represents the first real-time multi-person system to jointly detect human body, hand, and facial keypoints (in total 130 keypoints) on single images.	`Caffe`	Custom
Fully Convolutional Networks for Semantic Segmentation	Fully Convolutional Models for Semantic Segmentation.	`Caffe`	Not Found
Colorful Image Colorization	Colorful Image Colorization.	`Caffe`	BSD-2-Clause License
R-FCN	R-FCN: Object Detection via Region-based Fully Convolutional Networks.	`Caffe`	MIT License
cnn-vis	Inspired by Google's recent Inceptionism blog post, cnn-vis is an open-source tool that lets you use convolutional neural networks to generate images.	`Caffe`	The MIT License (MIT)
DeconvNet	Learning Deconvolution Network for Semantic Segmentation.	`Caffe`	Custom


MXNet

Model Name	Description	Framework	License
Faster RCNN	Region Proposal Network solves object detection as a regression problem.	`MXNet`	Apache License, Version 2.0
SSD	SSD is a unified framework for object detection with a single network.	`MXNet`	MIT License
Faster RCNN+Focal Loss	An unofficial implementation of Focal Loss for Dense Object Detection.	`MXNet`	Not Found
CNN-LSTM-CTC	Three different models for text recognition, each using a CTC loss layer so that text images do not need to be segmented into characters.	`MXNet`	Not Found
Faster_RCNN_for_DOTA	This is the official repo of paper DOTA: A Large-scale Dataset for Object Detection in Aerial Images.	`MXNet`	Apache License
RetinaNet	Focal loss for Dense Object Detection.	`MXNet`	Not Found
MobileNetV2	This is a MXNet implementation of MobileNetV2 architecture as described in the paper Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation.	`MXNet`	Apache License
neuron-selectivity-transfer	This code is a re-implementation of the imagenet classification experiments in the paper Like What You Like: Knowledge Distill via Neuron Selectivity Transfer.	`MXNet`	Apache License
MobileNetV2	This is a Gluon implementation of MobileNetV2 architecture as described in the paper Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation.	`MXNet`	Apache License
sparse-structure-selection	This code is a re-implementation of the imagenet classification experiments in the paper Data-Driven Sparse Structure Selection for Deep Neural Networks.	`MXNet`	Apache License
FastPhotoStyle	A Closed-form Solution to Photorealistic Image Stylization.	`MXNet`	Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License


Contributions

Your contributions are always welcome! Please have a look at contributing.md.

License

MIT License

rnjbdya/CV-pretrained-model
