Real-time multi-object, segmentation and pose tracking using Yolov8 | Yolo-NAS | YOLOX with DeepOCSORT and LightMBN

Introduction

This repo contains a collection of pluggable state-of-the-art multi-object trackers for object detectors. We provide examples of how to use this package together with popular object detection models such as Yolov8, Yolo-NAS and YOLOX.

Supported tracking methods

DeepOCSORT, BoTSORT, StrongSORT, OCSORT and ByteTrack. DeepOCSORT, BoTSORT and StrongSORT are based on motion + appearance description; OCSORT and ByteTrack are based on motion only. For the methods using appearance description, lightweight state-of-the-art ReID models (LightMBN, OSNet and more) are downloaded automatically as well.
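
To make the motion vs. motion + appearance distinction concrete, here is a minimal sketch of how a motion cue (box overlap) and an appearance cue (ReID embedding similarity) can be fused into a single association cost. It is a generic illustration and not the code of any tracker listed above; the function names (`iou_matrix`, `cosine_distance`, `fused_cost`) and the weight `w_app` are hypothetical.

```python
import numpy as np


def iou_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise IoU between boxes a (Nx4) and b (Mx4), both in (x1, y1, x2, y2)."""
    # Corners of every pairwise intersection, broadcast to N x M x 2
    top_left = np.maximum(a[:, None, :2], b[None, :, :2])
    bottom_right = np.minimum(a[:, None, 2:], b[None, :, 2:])
    wh = np.clip(bottom_right - top_left, 0, None)
    inter = wh[..., 0] * wh[..., 1]
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    union = area_a[:, None] + area_b[None, :] - inter
    return inter / np.maximum(union, 1e-9)


def cosine_distance(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Pairwise cosine distance between embeddings x (NxD) and y (MxD)."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    return 1.0 - x @ y.T


def fused_cost(track_boxes, det_boxes, track_feats, det_feats, w_app=0.5):
    """Weighted sum of a motion cost (1 - IoU) and an appearance cost."""
    motion_cost = 1.0 - iou_matrix(track_boxes, det_boxes)
    appearance_cost = cosine_distance(track_feats, det_feats)
    return (1.0 - w_app) * motion_cost + w_app * appearance_cost


# Two existing tracks and two new detections, listed in swapped order
track_boxes = np.array([[0.0, 0.0, 10.0, 10.0], [20.0, 20.0, 30.0, 30.0]])
det_boxes = np.array([[21.0, 21.0, 31.0, 31.0], [1.0, 1.0, 11.0, 11.0]])
track_feats = np.array([[1.0, 0.0], [0.0, 1.0]])  # toy 2-D "embeddings"
det_feats = np.array([[0.0, 1.0], [1.0, 0.0]])

cost = fused_cost(track_boxes, det_boxes, track_feats, det_feats)
matches = cost.argmin(axis=1)  # greedy per-track choice, for illustration only
print(matches)
```

With `w_app=0` the cost depends on box overlap alone, which corresponds to the motion-only setting; real trackers typically solve the assignment with a proper matching algorithm rather than the greedy `argmin` used here.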

Tutorials

Experiments

In inverse chronological order:

Why use this tracking toolbox?

Everything is designed with simplicity and flexibility in mind. Rather than hyperfocusing on results on a single dataset, we prioritize real-world results. If you don't get good tracking results on your custom dataset with the out-of-the-box tracker configurations, use the examples/evolve.py script for tracker hyperparameter tuning.

Installation

Start with a Python>=3.8 environment.

If you want to run the YOLOv8, YOLO-NAS or YOLOX examples:

git clone https://github.com/mikel-brostrom/yolo_tracking.git
cd yolo_tracking
pip install -v -e .

If you only want to import the tracking modules, you can simply run:

pip install boxmot

YOLOv8 | YOLO-NAS | YOLOX examples

Tracking

Yolo models

$ python examples/track.py --yolo-model yolov8n       # bboxes only
  python examples/track.py --yolo-model yolo_nas_s    # bboxes only
  python examples/track.py --yolo-model yolox_n       # bboxes only
                                        yolov8n-seg   # bboxes + segmentation masks
                                        yolov8n-pose  # bboxes + pose estimation

Tracking methods

$ python examples/track.py --tracking-method deepocsort
                                             strongsort
                                             ocsort
                                             bytetrack
                                             botsort

Tracking sources

Tracking can be run on most video formats:

$ python examples/track.py --source 0                               # webcam
                                    img.jpg                         # image
                                    vid.mp4                         # video
                                    path/                           # directory
                                    path/*.jpg                      # glob
                                    'https://youtu.be/Zgi9g1ksQHc'  # YouTube
                                    'rtsp://example.com/media.mp4'  # RTSP, RTMP, HTTP stream

Select ReID model

Some tracking methods combine appearance description and motion in the process of tracking. For those that use appearance, you can choose a ReID model based on your needs from this ReID model zoo. These models can be further optimized for your needs with the reid_export.py script.

$ python examples/track.py --source 0 --reid-model lmbn_n_cuhk03_d.pt
                                                   osnet_x0_25_market1501.pt
                                                   mobilenetv2_x1_4_msmt17.engine
                                                   resnet50_msmt17.onnx
                                                   osnet_x1_0_msmt17.pt
                                                   ...

Filter tracked classes

By default the tracker tracks all MS COCO classes.

If you want to track a subset of the classes that your model predicts, add their corresponding indices after the --classes flag:

python examples/track.py --source 0 --yolo-model yolov8s.pt --classes 15 16  # COCO yolov8 model. Track cats and dogs only

Here is a list of all the possible objects that a Yolov8 model trained on MS COCO can detect. Note that the indexing for the classes in this repo starts at zero.
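
If you feed a tracker from your own detector instead of the example script, the same class filtering can be done on the detection array before it reaches the tracker. The sketch below assumes detections in the Nx6 (x, y, x, y, conf, cls) layout used elsewhere in this README; `filter_classes` is a hypothetical helper, not part of the package.

```python
from typing import Sequence

import numpy as np


def filter_classes(dets: np.ndarray, keep: Sequence[int]) -> np.ndarray:
    """Keep only rows of an Nx6 (x1, y1, x2, y2, conf, cls) array whose class is in `keep`."""
    mask = np.isin(dets[:, 5].astype(int), keep)
    return dets[mask]


dets = np.array([
    [144, 212, 578, 480, 0.82, 0],   # zero-based COCO index 0: person
    [425, 281, 576, 472, 0.56, 65],  # some other class, filtered out below
    [30, 40, 120, 200, 0.91, 2],     # zero-based COCO index 2: car
])

kept = filter_classes(dets, keep=[0, 2])  # persons and cars only
print(kept)
```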

MOT compliant results

MOT-compliant results can be saved to your experiment folder runs/track/exp*/ with:

python examples/track.py --source ... --save-mot
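
For reference, the MOTChallenge text format stores one comma-separated row per tracked box: frame, id, bb_left, bb_top, bb_width, bb_height, conf, x, y, z, with the last three fields set to -1 for 2D tracking. The sketch below converts tracker outputs in the (x, y, x, y, id, conf, cls) layout to such rows. `to_mot_rows` is a hypothetical helper, and the exact columns and precision written by --save-mot may differ.

```python
import numpy as np


def to_mot_rows(frame_idx: int, tracks: np.ndarray) -> list:
    """Convert rows of (x1, y1, x2, y2, id, conf, cls) to MOTChallenge-style text rows."""
    rows = []
    for x1, y1, x2, y2, track_id, conf, _cls in tracks:
        w, h = x2 - x1, y2 - y1  # MOT stores top-left corner plus width and height
        rows.append(
            f"{frame_idx},{int(track_id)},{x1:.2f},{y1:.2f},{w:.2f},{h:.2f},{conf:.2f},-1,-1,-1"
        )
    return rows


tracks = np.array([[144.0, 212.0, 578.0, 480.0, 1, 0.82, 0]])
rows = to_mot_rows(frame_idx=1, tracks=tracks)  # MOT frame numbers start at 1
print("\n".join(rows))
```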

Evaluation

Evaluate a combination of detector, tracking method and ReID model on a standard MOT dataset or your custom one with:

$ python3 examples/val.py --yolo-model yolo_nas_s.pt --reid-model osnet_x1_0_dukemtmcreid.pt --tracking-method deepocsort --benchmark MOT16
                          --yolo-model yolox_n.pt    --reid-model osnet_ain_x1_0_msmt17.pt   --tracking-method ocsort     --benchmark MOT17
                          --yolo-model yolov8s.pt    --reid-model lmbn_n_market.pt           --tracking-method strongsort --benchmark <your-custom-dataset>
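
Results on MOT benchmarks are commonly reported with metrics such as MOTA and IDF1. As a reminder of what those numbers mean, the sketch below implements their standard definitions from already-accumulated counts. It is not the evaluation code used by examples/val.py, and the counts in the usage lines are made up for illustration.

```python
def mota(fn: int, fp: int, idsw: int, num_gt: int) -> float:
    """MOTA = 1 - (misses + false positives + identity switches) / ground-truth boxes."""
    return 1.0 - (fn + fp + idsw) / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """IDF1 = 2 * IDTP / (2 * IDTP + IDFP + IDFN), the F1 score over identity matches."""
    return 2 * idtp / (2 * idtp + idfp + idfn)


# Made-up counts, only to show how the functions are called
print(f"MOTA: {mota(fn=10, fp=5, idsw=1, num_gt=100):.3f}")
print(f"IDF1: {idf1(idtp=80, idfp=10, idfn=20):.3f}")
```

MOTA penalizes detection errors and identity switches together, while IDF1 focuses on how consistently identities are preserved, which is why the two can rank trackers differently.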

Evolution

We use a fast and elitist multiobjective genetic algorithm for tracker hyperparameter tuning. By default the objectives are HOTA, MOTA and IDF1. Run it with:

$ python examples/evolve.py --tracking-method strongsort --benchmark MOT17 --n-trials 100  # tune strongsort for MOT17
                            --tracking-method ocsort     --benchmark <your-custom-dataset> --objective HOTA # tune ocsort for maximizing HOTA on your custom tracking dataset

The set of hyperparameters leading to the best HOTA result is written to the tracker's config file.
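
With several objectives there is usually no single best trial, only a set of trade-offs. "Fast and elitist multiobjective genetic algorithm" is the wording of the NSGA-II paper title, although this README does not name the implementation. Whatever the exact algorithm, the underlying idea is Pareto dominance: a trial is kept if no other trial is at least as good on every objective and strictly better on one. The sketch below shows that filter on its own; `pareto_front` is a hypothetical helper and the scores are made up.

```python
import numpy as np


def pareto_front(scores: np.ndarray) -> np.ndarray:
    """Return indices of non-dominated rows; every column is a metric to maximize."""
    keep = []
    for i, s in enumerate(scores):
        # s is dominated if another row is >= on all metrics and > on at least one
        dominated = np.any(np.all(scores >= s, axis=1) & np.any(scores > s, axis=1))
        if not dominated:
            keep.append(i)
    return np.array(keep, dtype=int)


# Made-up (HOTA, MOTA, IDF1) scores for three hypothetical trials
scores = np.array([
    [0.60, 0.70, 0.65],  # best MOTA
    [0.62, 0.68, 0.66],  # best HOTA and IDF1
    [0.58, 0.69, 0.64],  # worse than the first trial on every metric
])

front = pareto_front(scores)
print(front)
```

Picking one configuration from the front then requires a tie-breaking rule, such as the highest HOTA.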

Custom object detection model example

Minimalistic

from boxmot import DeepOCSORT
from pathlib import Path
import cv2


tracker = DeepOCSORT(
  model_weights=Path('osnet_x0_25_msmt17.pt'),  # which ReID model to use
  device='cuda:0',  # 'cpu', 'cuda:0', 'cuda:1', ... 'cuda:N'
  fp16=True,  # whether to run the ReID model with half precision or not
)

cap = cv2.VideoCapture(0)
while True:
    ret, im = cap.read()
    ...
    # dets (numpy.ndarray):
    #  - your model's nms:ed outputs of shape Nx6 (x, y, x, y, conf, cls)
    # im   (numpy.ndarray):
    #  - the original hxwx3 image (for better ReID results)
    #  - the downscaled hxwx3 image fed to your model (faster)
    tracker_outputs = tracker.update(dets, im)  # --> (x, y, x, y, id, conf, cls)
    ...
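
Many detectors return boxes, scores and class ids as separate arrays, often with boxes in centre format. The sketch below stacks them into the Nx6 (x, y, x, y, conf, cls) layout described in the comments above. `to_dets` is a hypothetical helper, and the centre-format (cx, cy, w, h) input is an assumption about your detector. Returning a 0x6 array for frames without detections keeps the shape consistent, though this README does not state how the trackers handle empty input.

```python
import numpy as np


def to_dets(boxes_xywh, scores, class_ids) -> np.ndarray:
    """Stack (cx, cy, w, h) boxes, scores and class ids into an Nx6 (x1, y1, x2, y2, conf, cls) array."""
    boxes = np.asarray(boxes_xywh, dtype=float).reshape(-1, 4)
    if boxes.shape[0] == 0:
        return np.empty((0, 6))  # no detections in this frame
    cx, cy, w, h = boxes.T
    xyxy = np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)
    return np.column_stack([
        xyxy,
        np.asarray(scores, dtype=float),
        np.asarray(class_ids, dtype=float),
    ])


# One box centred at (100, 50), 20 wide and 10 high
dets = to_dets([[100, 50, 20, 10]], scores=[0.9], class_ids=[0])
print(dets)
```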

Complete

from boxmot import DeepOCSORT
from pathlib import Path
import cv2
import numpy as np

tracker = DeepOCSORT(
    model_weights=Path('osnet_x0_25_msmt17.pt'), # which ReID model to use
    device='cuda:0',
    fp16=True,
)

vid = cv2.VideoCapture(0)
color = (0, 0, 255)  # BGR
thickness = 2
fontscale = 0.5

while True:
    ret, im = vid.read()
    if not ret:  # stop when no frame could be read
        break

    # substitute by your object detector, output has to be N X (x, y, x, y, conf, cls)
    dets = np.array([[144, 212, 578, 480, 0.82, 0],
                    [425, 281, 576, 472, 0.56, 65]])

    ts = tracker.update(dets, im) # --> (x, y, x, y, id, conf, cls)

    # print bboxes with their associated id, cls and conf
    # (slice inside the check so an empty tracker output is never indexed)
    if ts.shape[0] != 0:
        xyxys = ts[:, 0:4].astype('int')  # float64 to int
        ids = ts[:, 4].astype('int')  # float64 to int
        confs = ts[:, 5]
        clss = ts[:, 6]

        for xyxy, id, conf, cls in zip(xyxys, ids, confs, clss):
            im = cv2.rectangle(
                im,
                (xyxy[0], xyxy[1]),
                (xyxy[2], xyxy[3]),
                color,
                thickness
            )
            cv2.putText(
                im,
                f'id: {id}, conf: {conf}, c: {cls}',
                (xyxy[0], xyxy[1]-10),
                cv2.FONT_HERSHEY_SIMPLEX,
                fontscale,
                color,
                thickness
            )

    # show image with bboxes, ids, classes and confidences
    cv2.imshow('frame', im)

    # break on pressing q
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

vid.release()
cv2.destroyAllWindows()

Contact

For Yolov8 tracking bugs and feature requests please visit GitHub Issues. For business inquiries or professional support requests please send an email to: yolov5.deepsort.pytorch@gmail.com
