This is the code for my project "Multiple Object Tracking for Military Vehicles", which is part of my BSc in Electrical Engineering at the Technion, Israel. The project was done under the supervision of Gabi Davidov, PhD; many thanks for his guidance and support during the whole process.
The project can be divided into roughly three parts:

- Create a Custom Dataset: A search online shows that no public dataset for military vehicle detection is available. Therefore, a custom dataset containing around 4,000 images was collected and labeled for the project.
- Train an Object Detection Model: This project uses YOLOv5 [3] for object detection. As the popular training datasets (such as COCO, ImageNet, etc.) contain only a limited number of military vehicle images, training a detector on a custom dataset is necessary.
- Combine with an Object Tracking Model: After obtaining the detections (a bounding box and class for each object in every frame), the purpose of the tracking phase is to associate the objects across frames. For this purpose, the DeepSORT [1] algorithm was chosen, with a pre-trained PyTorch implementation [2]; a rough sketch of this detector-to-tracker hand-off is shown below.
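As an illustration of that hand-off, the detector produces per-frame boxes that are fed to the tracker, which returns the same boxes with persistent track IDs. The sketch below is a minimal outline only: it assumes the bundled `deep_sort_pytorch` package exposes a `DeepSort` class with an `update(bbox_xywh, confidences, frame)` method (as in the ZQPei implementation); the exact import path and signatures in this repository may differ.

```python
import cv2
import numpy as np
import torch

# Assumed import path and API - may differ in this repository
from src.deep_sort_pytorch.deep_sort import DeepSort

detector = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # any YOLOv5 model
tracker = DeepSort('path/to/reid_ckpt.t7')  # re-ID checkpoint (placeholder)

cap = cv2.VideoCapture('input.mp4')  # placeholder video
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # YOLOv5 returns one tensor per image: [x1, y1, x2, y2, conf, class]
    det = detector(frame).xyxy[0].cpu().numpy()
    # Convert corner boxes to the center format DeepSORT expects
    xywh = np.stack([(det[:, 0] + det[:, 2]) / 2,    # x center
                     (det[:, 1] + det[:, 3]) / 2,    # y center
                     det[:, 2] - det[:, 0],          # width
                     det[:, 3] - det[:, 1]], axis=1) # height
    # Each returned row is a box plus a persistent track ID
    tracks = tracker.update(xywh, det[:, 4], frame)
cap.release()
```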
## Repository Structure
```
├─ src
│  ├─ deep_sort_pytorch
│  ├─ utils
│  │  ├─ common_images_dataset_downloader.ipynb
│  │  ├─ download_Udacity_self_driving_car_dataset.ipynb
│  │  ├─ feature_matching_LoFTR.ipynb
│  │  ├─ Google_images.ipynb
│  │  ├─ Google_images.py
│  │  ├─ super_resolution.ipynb
│  │  ├─ super_resolution.py
│  │  └─ README.md
│  ├─ data_utils.py
│  ├─ plot_utils.py
│  ├─ tracker.py
│  └─ video.py
├─ figures
├─ notebooks
│  ├─ Compare Detectors.ipynb
│  ├─ test.ipynb
│  └─ Train YOLOv5.ipynb
└─ README.md
```
Using the object tracking model is straightforward and should look similar to the snippet below (here `test_videos_path`, `video_name`, `results_path`, `weights_path`, and `device` are placeholders to fill in):
```python
import torch

from src.video import Video
from src.tracker import MultiObjectTracker
from src.plot_utils import plot_bounding_boxes

# initialize a video
video = Video(f'{test_videos_path}/{video_name}')
# initialize the object detector (custom YOLOv5 weights)
detector = torch.hub.load('ultralytics/yolov5', 'custom', weights_path).to(device)
# initialize the object tracker
tracker = MultiObjectTracker(video, results_path, detector)
# iterate over the frames in the video
for frame, bounding_boxes in tracker:
    plot_bounding_boxes(frame, bounding_boxes)
    tracker.video_writer.write(frame)
# save the results
tracker.video_writer.release()
```
It's recommended to check out the example notebook, which:

- Automatically downloads the trained models and test videos (from this Release)
- Shows all the steps required to run the detectors
- Displays the results on the test videos after the tracking is done
- Runs on Google Colab, so all the code executes in the cloud with a free GPU
Note: This explanation won't cover everything about training a YOLOv5 model, but all the necessary information can be found on the YOLOv5 GitHub Repository.
## How to Organize a YOLOv5 Dataset?
- All images and labels should be inside a folder named `dataset`, located at the same level as the `yolov5` folder.
- Each image should have a matching label file with the same name and a `.txt` extension (a small consistency-check sketch follows the tree below).
```
├─ yolov5
└─ dataset
   ├─ images
   │  ├─ train
   │  │  ├─ file1.jpg
   │  │  └─ file2.jpg
   │  └─ val
   │     ├─ file3.jpg
   │     └─ file4.jpg
   └─ labels
      ├─ train
      │  ├─ file1.txt
      │  └─ file2.txt
      └─ val
         ├─ file3.txt
         └─ file4.txt
```
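Since each image is matched to its label only by this naming convention, a quick consistency check can save a failed training run. A minimal sketch, assuming the `dataset` layout above (the root path is a placeholder):

```python
from pathlib import Path

# Sanity check: every image should have a label file with the same stem,
# mirrored under labels/ (the dataset root path is a placeholder).
dataset = Path('dataset')
for split in ('train', 'val'):
    for image in (dataset / 'images' / split).glob('*.jpg'):
        label = dataset / 'labels' / split / f'{image.stem}.txt'
        if not label.exists():
            print(f'Missing label for {split}/{image.name}')
```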
Each label file should contain all the bounding boxes in its image, and looks similar to:

```
0 0.480109 0.631250 0.684532 0.713589
3 0.780210 0.325648 0.125679 0.456123
```
Where:

- There is only one bounding box per row.
- Each row represents a bounding box as: `class x_center y_center width height`.
- Class values start from 0.
- Bounding box coordinates are normalized between 0 and 1 (see the conversion sketch below).
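To make the normalization concrete, the sketch below converts the first example row back to pixel corner coordinates; the image size is a made-up placeholder:

```python
# Convert one normalized YOLO label row to pixel corner coordinates
line = '0 0.480109 0.631250 0.684532 0.713589'
img_w, img_h = 1280, 720  # placeholder image size

cls, xc, yc, w, h = line.split()
xc, w = float(xc) * img_w, float(w) * img_w
yc, h = float(yc) * img_h, float(h) * img_h
x1, y1 = xc - w / 2, yc - h / 2  # top-left corner
x2, y2 = xc + w / 2, yc + h / 2  # bottom-right corner
print(int(cls), round(x1), round(y1), round(x2), round(y2))
```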
## How to Train on a Custom Dataset?
- For this project, I used Google Colab for the training.
- The training pipeline is wrapped in Colab Forms, which allows using the code in a GUI-like environment with minimal code.

Note: double-clicking a form in Colab reveals the code that runs behind it. Hide the code again by right-clicking the cell and choosing the hide code option in the forms tab.
In order to initialize the training notebook, the following fields should be filled in (an example `.yaml` is shown after the table):

| Field | Content |
|---|---|
| `DATASET_IMAGES_PATH` | Path to the zip file containing the images |
| `DATASET_LABELS_PATH` | Path to the zip file containing the labels |
| `TEST_VIDEOS_PATH` | Path to the zip file containing the test videos (optional) |
| `YOLO_YAML_PATH` | Path to the `.yaml` file with the YOLOv5 configuration |
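For reference, a YOLOv5 dataset `.yaml` typically looks like the sketch below; the paths and class names here are placeholders, not the project's actual classes:

```yaml
# Example YOLOv5 dataset configuration (placeholder paths and classes)
train: ../dataset/images/train
val: ../dataset/images/val

nc: 2                  # number of classes
names: ['tank', 'apc'] # class 0, class 1 (placeholders)
```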
Then, run the cells, and:

- Press `Show Runtime Details` to ensure that GPU acceleration is on.
- Press `Get train dataset` (it will download the dataset from the path given earlier).
- Press `Get test videos` (optional, only needed if you are going to run tests on videos).
- After the training is done, the weights and run results can be zipped and downloaded.
- The trained model can now be used to run inference on videos or images; see the YOLOv5 repository for more details, or the sketches below.
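Behind the Colab forms, the training ultimately goes through the standard YOLOv5 training script. A minimal sketch of an equivalent Colab cell, where the image size, batch size, and epoch count are illustrative placeholders rather than the project's actual settings:

```python
# Roughly what the training form runs behind the scenes; the hyperparameter
# values below are placeholders, not the settings used in this project.
!python train.py --img 640 --batch 16 --epochs 100 \
    --data dataset.yaml --weights yolov5s.pt
```

And a minimal inference sketch using the trained weights via `torch.hub` (the weights path and image name are placeholders):

```python
import torch

# Load the trained weights (placeholder path) and run detection on one image
model = torch.hub.load('ultralytics/yolov5', 'custom',
                       path='runs/train/exp/weights/best.pt')
results = model('test_image.jpg')
results.print()  # print a summary of the detections
```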