/pretrained-models.pytorch

Pretrained ConvNets for pytorch: ResNeXt101, ResNet152, InceptionV4, InceptionResnetV2, etc.

Primary LanguagePythonBSD 3-Clause "New" or "Revised" LicenseBSD-3-Clause

Pretrained models for Pytorch (Work in progress)

The goal of this repo is:

  • to help to reproduce research papers results (transfer learning setups for instance),
  • to access pretrained ConvNets with a unique interface/API inspired by torchvision.

News:

  • 22/07/2017: torchvision pretrained models
  • 22/07/2017: momentum in inceptionv4 and inceptionresnetv2 to 0.1
  • 17/07/2017: model.input_range attribut
  • 17/07/2017: BNInception pretrained on ImageNet

Summary

Installation

  1. python3 with anaconda
  2. pytorch with/out CUDA
  3. git clone https://github.com/Cadene/pretrained-models.pytorch.git

Toy example

  • See test/toy-example.py to compute logits of classes appearance with pretrained models on imagenet.

python test/toy-example.py -a fbresnet152

from PIL import Image
import torch
import torchvision.transforms as transforms

import sys
sys.path.append('yourdir/pretrained-models.pytorch') # if needed
import pretrainedmodels

# Load Model
model_name = 'inceptionresnetv4' #fbresnet152
model = pretrainedmodels.__dict__[model_name](num_classes=1000, pretrained='imagenet')
model.eval()

# Load One Input Image
path_img = 'data/cat.jpg'
with open(path_img, 'rb') as f:
    with Image.open(f) as img:
        input_data = img.convert(model.input_space)

tf = transforms.Compose([
    transforms.Scale(round(max(model.input_size)*1.143)),
    transforms.CenterCrop(max(model.input_size)),
    transforms.ToTensor(),
    transforms.Normalize(mean=model.mean, std=model.std)
])

input_data = tf(input_data)          # 3x400x225 -> 3x299x299
input_data = input_data.unsqueeze(0) # 3x299x299 -> 1x3x299x299
input = torch.autograd.Variable(input_data)

# Load Imagenet Synsets
with open('data/imagenet_synsets.txt', 'r') as f:
    synsets = f.readlines()

# len(synsets)==1001
# sysnets[0] == background
synsets = [x.strip() for x in synsets]
splits = [line.split(' ') for line in synsets]
key_to_classname = {spl[0]:' '.join(spl[1:]) for spl in splits}

with open('data/imagenet_classes.txt', 'r') as f:
    class_id_to_key = f.readlines()

class_id_to_key = [x.strip() for x in class_id_to_key]

# Make predictions
output = model(input) # size(1, 1000)
max, argmax = output.data.squeeze().max(0)
class_id = argmax[0]
class_key = class_id_to_key[class_id]
classname = key_to_classname[class_key]

print(path_img, 'is a', classname) 

Evaluation on imagenet

Accuracy on validation set

Model Version Acc@1 Acc@5
InceptionResNetV2 Tensorflow 80.4 95.3
InceptionV4 Tensorflow 80.2 95.3
InceptionResNetV2 Our porting 80.170 95.234
InceptionV4 Our porting 80.062 94.926
ResNeXt101_64x4d Torch7 79.6 94.7
ResNeXt101_64x4d Our porting 78.956 94.252
ResNeXt101_32x4d Torch7 78.8 94.4
ResNet152 Pytorch 78.428 94.110
ResNeXt101_32x4d Our porting 78.188 93.886
FBResNet152 Torch7 77.84 93.84
DenseNet161 Pytorch 77.560 93.798
FBResNet152 Our porting 77.386 93.594
InceptionV3 Pytorch 77.294 93.454
DenseNet201 Pytorch 77.152 93.548
ResNet101 Pytorch 77.438 93.672
DenseNet169 Pytorch 76.026 92.992
ResNet50 Pytorch 76.002 92.980
DenseNet121 Pytorch 74.646 92.136
VGG19_BN Pytorch 74.266 92.066
ResNet34 Pytorch 73.554 91.456
BNInception Caffe 73.522 91.560
VGG16_BN Pytorch 73.518 91.608
VGG19 Pytorch 72.080 90.822
VGG16 Pytorch 71.636 90.354
VGG13_BN Pytorch 71.508 90.494
VGG11_BN Pytorch 70.452 89.818
ResNet18 Pytorch 70.142 89.274
VGG13 Pytorch 69.662 89.264
VGG11 Pytorch 68.970 88.746
SqueezeNet1_1 Pytorch 58.250 80.800
SqueezeNet1_0 Pytorch 58.108 80.428
Alexnet Pytorch 56.432 79.194

Note: the Pytorch version of ResNet152 is not a porting of the Torch7 but has been retrained by facebook.

Beware, the accuracy reported here is not always representative of the transferable capacity of the network on other tasks and datasets. You must try them all! :P

Reproducing results

Download the ImageNet dataset and move validation images to labeled subfolders

python test/imagenet.py /local/data/imagenet_2012/images --arch resnext101_32x4d -e

Documentation

Available models

FaceBook ResNet*

Source: Torch7 repo of FaceBook

There are a bit different from the ResNet* of torchvision. ResNet152 is currently the only one available.

  • fbresnet152(num_classes=1000, pretrained='imagenet')

Inception*

Source: TensorFlow Slim repo and Pytorch/Vision repo for inceptionv3

  • inceptionresnetv2(num_classes=1000, pretrained='imagenet')
  • inceptionresnetv2(num_classes=1001, pretrained='imagenet+background')
  • inceptionv4(num_classes=1000, pretrained='imagenet')
  • inceptionv4(num_classes=1001, pretrained='imagenet+background')
  • inceptionv3(num_classes=1000, pretrained='imagenet')

BNInception

Source: Trained with Caffe by Xiong Yuanjun

  • bninception(num_classes=1000, pretrained='imagenet')

ResNeXt*

Source: ResNeXt repo of FaceBook

  • resnext101_32x4d(num_classes=1000, pretrained='imagenet')
  • resnext101_62x4d(num_classes=1000, pretrained='imagenet')

TorchVision

Source: Pytorch/Vision repo

(inceptionv3 included in Inception*)

  • resnet18(num_classes=1000, pretrained='imagenet')
  • resnet34(num_classes=1000, pretrained='imagenet')
  • resnet50(num_classes=1000, pretrained='imagenet')
  • resnet101(num_classes=1000, pretrained='imagenet')
  • resnet152(num_classes=1000, pretrained='imagenet')
  • densenet121(num_classes=1000, pretrained='imagenet')
  • densenet161(num_classes=1000, pretrained='imagenet')
  • densenet169(num_classes=1000, pretrained='imagenet')
  • densenet201(num_classes=1000, pretrained='imagenet')
  • squeezenet1_0(num_classes=1000, pretrained='imagenet')
  • squeezenet1_1(num_classes=1000, pretrained='imagenet')
  • alexnet(num_classes=1000, pretrained='imagenet')
  • vgg11(num_classes=1000, pretrained='imagenet')
  • vgg13(num_classes=1000, pretrained='imagenet')
  • vgg16(num_classes=1000, pretrained='imagenet')
  • vgg19(num_classes=1000, pretrained='imagenet')
  • vgg11_bn(num_classes=1000, pretrained='imagenet')
  • vgg13_bn(num_classes=1000, pretrained='imagenet')
  • vgg16_bn(num_classes=1000, pretrained='imagenet')
  • vgg19_bn(num_classes=1000, pretrained='imagenet')

Model API

Once a pretrained model has been loaded, you can use it that way.

Important note: All image must be loaded using PIL which scales the pixel values between 0 and 1.

model.input_size

Attribut of type list composed of 3 numbers:

  • number of color channels,
  • height of the input image,
  • width of the input image.

Example:

  • [3, 299, 299] for inception* networks,
  • [3, 224, 224] for resnet* networks.

model.input_space

Attribut of type str representating the color space of the image. Can be RGB or BGR.

model.input_range

Attribut of type list composed of 2 numbers:

  • min pixel value,
  • max pixel value.

Example:

  • [0, 1] for resnet* and inception* networks,
  • [0, 255] for bninception network.

model.mean

Attribut of type list composed of 3 numbers which are used to normalize the input image (substract "color-channel-wise").

Example:

  • [0.5, 0.5, 0.5] for inception* networks,
  • [0.485, 0.456, 0.406] for resnet* networks.

model.std

Attribut of type list composed of 3 numbers which are used to normalize the input image (divide "color-channel-wise").

Example:

  • [0.5, 0.5, 0.5] for inception* networks,
  • [0.229, 0.224, 0.225] for resnet* networks.

model.features

/!\ work in progress (may not be available)

Method which is used to extract the features from the image.

Example when the model is loaded using fbresnet152:

print(input_224.size())            # (1,3,224,224)
output = model.features(input_224) 
print(output.size())               # (1,2048,1,1)

# print(input_448.size())          # (1,3,448,448)
output = model.features(input_448)
# print(output.size())             # (1,2048,7,7)

model.classif

/!\ work in progress (may not be available)

Method which is used to classify the features from the image.

Example when the model is loaded using fbresnet152:

output = model.features(input_224) 
output = output.view(1,-1)
print(output.size())               # (1,2048)
output = model.classif(output)
print(output.size())               # (1,1000)

model.forward

Method used to call model.features and model.classif. It can be overwritten as desired.

Important note: A good practice is to use model.__call__ as your function of choice to forward an input to your model. See the example bellow.

# Without model.__call__
output = model.forward(input_224)
print(output.size())      # (1,1000)

# With model.__call__
output = model(input_224)
print(output.size())      # (1,1000)

Reproducing

Hand porting of ResNet152

th pretrainedmodels/fbresnet/resnet152_dump.lua
python pretrainedmodels/fbresnet/resnet152_load.py

Automatic porting of ResNeXt

https://github.com/clcarwin/convert_torch_to_pytorch

Hand porting of InceptionV4 and InceptionResNetV2

https://github.com/Cadene/tensorflow-model-zoo.torch

Acknowledgement

Thanks to the deep learning community and especially to the contributers of the pytorch ecosystem.