
Simple mmdetection CPU inference

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

Simple mmdetection

The overall goal of this project is to implement the most popular CPU-friendly object detection models from the mmdetection framework, for inference only, for both research and production use. Currently, mmdetection does not support a CPU-only inference mode (see here); in real life, however, production models are rarely deployed in GPU-enabled environments. SMD (Simple mmdetection) solves this problem.


  • Create a foundation for better understanding of, and research into, CPU-only DNN performance
  • All mmdetection pretrained weights can be used directly with SMD (see the mmdetection model zoo)
  • SMD limits its dependencies to torch, torchvision, PIL and numpy
  • Wherever possible, mmdetection-specific code is replaced with torch and torchvision alternatives (transforms, nms, etc.)
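
For readers unfamiliar with the nms step mentioned in the list above, here is a minimal sketch of what greedy non-maximum suppression computes. It is written in plain numpy purely for illustration and is not SMD's code (per the list above, SMD relies on the torchvision implementation). The function name `nms`, the `iou_threshold` default and the sample boxes are assumptions made for this example; boxes are assumed to be `[x1, y1, x2, y2]` with positive area.

```python
import numpy as np


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    scores = np.asarray(scores, dtype=np.float64)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(-scores)  # candidate indices, best score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Intersection rectangle between the best box and every remaining box
        ix1 = np.maximum(x1[best], x1[rest])
        iy1 = np.maximum(y1[best], y1[rest])
        ix2 = np.minimum(x2[best], x2[rest])
        iy2 = np.minimum(y2[best], y2[rest])
        inter = np.clip(ix2 - ix1, 0, None) * np.clip(iy2 - iy1, 0, None)
        iou = inter / (areas[best] + areas[rest] - inter)
        # Drop every remaining box that overlaps the best box too much
        order = rest[iou <= iou_threshold]
    return keep


# Hypothetical boxes: the first two overlap heavily, the third is separate
boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_threshold=0.5)
print(kept)
```

In this sketch the second box is suppressed because its IoU with the higher-scoring first box exceeds the threshold, while the third box does not overlap anything and is kept.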


  • By design, this code has no training capabilities at all. Training-specific code is either removed or reduced to the bare minimum. For training, fine-tuning or transfer learning, use mmdetection; you can then use the trained model with SMD for CPU-only inference.

Implemented architectures

  • TorchScript support (current priority)
  • RetinaNet with FPN and ResNet 50 backbone
  • RetinaNet with FPN and ResNet 101 backbone
  • Faster R-CNN with FPN and ResNet 50 backbone
  • Faster R-CNN with FPN and ResNet 101 backbone
  • Mask R-CNN with FPN and ResNet 50 backbone
  • Mask R-CNN with FPN and ResNet 101 backbone
  • RetinaNet with FPN and ResNet 50 with deformable convolutions backbone
  • RetinaNet with FPN and ResNet 101 with deformable convolutions backbone
  • SSD 300
  • SSD 512
  • FoveaBox


pip install -r requirements.txt

Sample code

See the demo Jupyter notebook for a complete example.

from models.detectors import create_detector
import torch
import torchvision
import cv2
import numpy as np
from matplotlib import pyplot as plt

# Download a pretrained mmdetection model from the model zoo first.

# Create RetinaNet with a ResNet 101 backbone and pretrained COCO weights.
# Note: COCO has 80 classes plus one background class. You can use your own model: just set your number of classes and pass
# your pretrained checkpoint.
retina = create_detector('retinanet_r101_fpn', number_of_classes=81, pretrained='retinanet_r101_fpn_1x_20181129-f016f384.pth')

# With PyTorch 1.3 or newer, the model can easily be dynamically quantized (better CPU performance, smaller footprint).
retina = torch.quantization.quantize_dynamic(retina, dtype=torch.qint8)

# The inference result is exactly the same as in mmdetection.
with torch.no_grad():
    result = retina.detect('demo.jpg')

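res = []

# Look for cars (COCO class index 2) with a score threshold of 0.3.
# Each row is assumed to be [x1, y1, x2, y2, score], as in mmdetection.
for r in result[2]:
    if r[-1] >= .3:
        # OpenCV drawing functions need plain integer pixel coordinates
        res.append([int(v) for v in r[:4]])

im = cv2.imread('demo.jpg')
for r in res:
    cv2.rectangle(im, (r[0], r[1]), (r[2], r[3]), (0, 255, 255), 3)
    cv2.putText(im, "Car", (r[0]-3, r[1]-3), cv2.FONT_HERSHEY_PLAIN, 2, (0, 255, 255), 3)

# cv2.imread returns BGR; convert to RGB before displaying with matplotlib
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
plt.imshow(im)
plt.show()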
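
The car-filtering step in the sample above can also be written as a small reusable helper. The following is a minimal sketch, assuming the mmdetection convention that the detection result is a list with one `(n, 5)` array per class and rows of `[x1, y1, x2, y2, score]`. The helper name `filter_detections` and the synthetic `fake_result` are hypothetical and used only so the example runs without a model or checkpoint.

```python
import numpy as np


def filter_detections(result, class_index, score_threshold=0.3):
    """Return integer (x1, y1, x2, y2) boxes of one class scoring at least the threshold."""
    dets = np.asarray(result[class_index], dtype=np.float32).reshape(-1, 5)
    kept = dets[dets[:, 4] >= score_threshold]
    # Round to plain Python ints so the boxes can be passed to OpenCV drawing calls
    return [tuple(int(round(float(v))) for v in row[:4]) for row in kept]


# Synthetic stand-in for a detector result: 80 classes, only class 2 has detections
fake_result = [np.zeros((0, 5), dtype=np.float32) for _ in range(80)]
fake_result[2] = np.array([[10.2, 20.7, 110.0, 220.4, 0.91],
                           [5.0, 5.0, 50.0, 50.0, 0.12]], dtype=np.float32)

cars = filter_detections(fake_result, class_index=2, score_threshold=0.3)
print(cars)
```

With a real model, `fake_result` would be replaced by the value returned from `retina.detect('demo.jpg')`, provided it follows the same layout.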

