Object detection and segmentation using PennFudanPed/ dataset

This folder contains data and code samples for object detection and object segmentation. The code was adapted from the PyTorch TorchVision Object Detection Finetuning Tutorial and from David Macêdo's GitHub repository. The intent is to cover every stage of the object detection and segmentation pipeline as a programming exercise, although not every aspect can be covered. It uses pre-trained models from PyTorch and the Penn-Fudan Database.

File Description
torchvision_01.py Reads a .PNG image from PennFudanPed/ with the torchvision library, applies transformations on the GPU or CPU, and shows the result on screen.
torchvision_02.py Takes instance segmentation mask images, converts them from Tensor to Pillow images, and then merges the masks into one image.

Use of tensors and transformation of tensors and images


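Basic examples of the image transforms offered by torchvision. The same conversion can be called in two ways: through the functional API (torchvision.transforms.functional) or through a transform object (torchvision.transforms).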

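import torchvision.transforms.functional as F
p_img_01 = F.to_pil_image(tensor_img)  # way 1: functional call

import torchvision.transforms as T
transform = T.ToPILImage()  # way 2: reusable transform object
p_img_02 = transform(tensor_img.to(device))  # tensor_img may be on the CPU or the GPU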
File Description
tensor_conversion_pytorch.py Reads images with read_image() and converts them; basic pipeline.
tensor_conversion_pil.py Reads images with PIL.Image.open() and converts them; basic pipeline.
tensor_conversion_opencv.py Reads images with OpenCV cv2.imread() and converts them; basic pipeline.
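
The three readers above return different layouts: read_image() gives a (C, H, W) RGB tensor, while cv2.imread() gives an (H, W, C) BGR NumPy array. The sketch below shows one way to move between the two, assuming uint8 images; the function names are hypothetical and not taken from the scripts.

```python
import numpy as np
import torch


def tensor_to_opencv(img: torch.Tensor) -> np.ndarray:
    """Convert a (C, H, W) RGB uint8 tensor to an (H, W, C) BGR uint8 array."""
    hwc_rgb = img.detach().cpu().permute(1, 2, 0).numpy()
    # Reverse the channel axis (RGB -> BGR) and make the memory contiguous,
    # since OpenCV functions generally expect contiguous arrays.
    return np.ascontiguousarray(hwc_rgb[:, :, ::-1])


def opencv_to_tensor(img: np.ndarray) -> torch.Tensor:
    """Convert an (H, W, C) BGR uint8 array to a (C, H, W) RGB uint8 tensor."""
    # torch.from_numpy() rejects negative strides, so copy after reversing.
    hwc_rgb = np.ascontiguousarray(img[:, :, ::-1])
    return torch.from_numpy(hwc_rgb).permute(2, 0, 1)


# Synthetic 2x3 image, pure red in RGB, laid out as read_image() would return it.
rgb_tensor = torch.zeros((3, 2, 3), dtype=torch.uint8)
rgb_tensor[0] = 255

bgr_array = tensor_to_opencv(rgb_tensor)
round_trip = opencv_to_tensor(bgr_array)
```

In the BGR array the red value ends up in the last channel, which is what cv2.imshow() and cv2.imwrite() expect.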

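Connecting tensor conversion with deep learning models. The examples use Mask R-CNN (from torchvision.models.detection import maskrcnn_resnet50_fpn, instantiated as maskrcnn_resnet50_fpn(pretrained=True)). In torchvision 0.13 and later, the pretrained argument is deprecated in favor of weights. The model output is converted to a binary mask.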

File Description
tensor_conversion_01.py Reads images with read_image() and converts them.
tensor_conversion_02.py Reads images with PIL.Image.open() and converts them.
tensor_conversion_03.py Reads images with cv2.imread() and converts them.
tensor_conversion_opencv_fasterrcnn.py Reads images with cv2.imread(), converts them for the Faster R-CNN model, and converts the result back to OpenCV format. A good example of conversions in a pipeline with models.
tensor_conversion_opencv_fasterrcnn_02.py Reads images with cv2.imread(), converts them for the Faster R-CNN V2 model, and converts the result back to OpenCV format. A good example of conversions in a pipeline with models.
tensor_conversion_opencv_maskrcnn.py Reads images with cv2.imread(), converts them for the Mask R-CNN model, and converts the result back to OpenCV format. A good example of conversions in a pipeline with models.
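
Mask R-CNN in torchvision returns, for each image, a dict whose "masks" entry holds soft masks of shape (N, 1, H, W) with values in [0, 1]. The sketch below shows one way to turn them into binary masks usable from OpenCV. The function name, both thresholds, and the synthetic output are assumptions for illustration; no model is loaded here.

```python
import numpy as np
import torch


def to_binary_masks(output, score_thr=0.5, mask_thr=0.5) -> np.ndarray:
    """Return a (K, H, W) uint8 array with values 0 or 255, one mask per kept instance."""
    keep = output["scores"] >= score_thr        # drop low-confidence instances
    soft = output["masks"][keep][:, 0]          # (K, 1, H, W) -> (K, H, W)
    binary = (soft > mask_thr).to(torch.uint8) * 255
    return binary.cpu().numpy()


# Synthetic stand-in for one element of the list a Mask R-CNN model returns.
fake_output = {
    "scores": torch.tensor([0.9, 0.2]),
    "masks": torch.tensor([[[[0.8, 0.1], [0.6, 0.4]]],
                           [[[0.9, 0.9], [0.9, 0.9]]]]),
}
masks = to_binary_masks(fake_output)
```

Only the first instance passes the score threshold, so masks has a single 2x2 mask.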

This link explains data type conversion.

Model pipelines for bounding box (BBOX) and mask segmentation (MASK)

Training models

File Description
./train_scripts/main_free_gpu_cache.py Tool to free GPU memory.
./train_scripts/main_training_code.py Code to train a people detector using the PennFudanPed/ dataset. This script produces a weights file in .pth format.
./train_scripts/tv-training-code_corrected.py Original code to train a people detector using the PennFudanPed/ dataset. This script produces a weights file in .pth format.
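
Freeing cached GPU memory between training runs can be sketched as below. This is an assumption about what such a tool might do, not the contents of main_free_gpu_cache.py; the function name is hypothetical.

```python
import gc

import torch


def free_gpu_cache() -> bool:
    """Release unreferenced Python objects and PyTorch's cached GPU memory.

    Returns True if a CUDA device was available and its cache was emptied.
    """
    gc.collect()  # drop unreachable objects that may still hold tensors
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # return cached blocks to the driver
        return True
    return False


freed = free_gpu_cache()
```

Note that empty_cache() only releases memory PyTorch has cached but is no longer using; tensors that are still referenced stay allocated.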


Testing bounding box models (BBOX) and mask segmentation models (MASK) on image sequences in PennFudanPed/

File Description
eval_pennfudanpen_bbox_01.py Detects people in the PennFudanPed/ dataset with the pretrained torchvision.models.detection.fasterrcnn_resnet50_fpn model.
eval_pennfudanpen_mask_01.py Detects people in the PennFudanPed/ dataset with the pretrained maskrcnn_resnet50_fpn model (from torchvision.models.detection import maskrcnn_resnet50_fpn).
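
A torchvision detection model returns, for each image, a dict with "boxes", "labels", and "scores". A minimal sketch of keeping only confident person detections follows. The 0.7 threshold, the function name, and the synthetic output are assumptions; label 1 is "person" in the COCO label map used by the pretrained torchvision models.

```python
import torch

PERSON_LABEL = 1  # "person" in the COCO label map used by torchvision


def select_people(output, score_thr=0.7):
    """Keep boxes labelled as person whose score reaches score_thr."""
    keep = (output["labels"] == PERSON_LABEL) & (output["scores"] >= score_thr)
    return output["boxes"][keep], output["scores"][keep]


# Synthetic stand-in for one element of the list the model returns.
fake_output = {
    "boxes": torch.tensor([[10.0, 20.0, 50.0, 120.0],
                           [0.0, 0.0, 30.0, 30.0],
                           [5.0, 5.0, 40.0, 90.0]]),
    "labels": torch.tensor([1, 18, 1]),
    "scores": torch.tensor([0.95, 0.90, 0.30]),
}
boxes, scores = select_people(fake_output)
```

Here the second detection is dropped because of its label and the third because of its low score.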

Testing bounding box models (BBOX) and mask segmentation models (MASK) on normal images.

File Description
eval_story_rgb_bbox_01.py Detects people in the story_rgb/ dataset with the pretrained torchvision.models.detection.fasterrcnn_resnet50_fpn model.
eval_story_rgb_mask_01.py Detects apples in the story_rgb/ dataset with the pretrained maskrcnn_resnet50_fpn model (from torchvision.models.detection import maskrcnn_resnet50_fpn).
IMPORTANT! eval_story_rgb_mask_02.py Detects apples in the story_rgb/ dataset with the pretrained maskrcnn_resnet50_fpn model (from torchvision.models.detection import maskrcnn_resnet50_fpn) and saves the data in an output/ folder.

Checking the trained weights in a .pth file with a Mask R-CNN model.

File Description
main_evaluate_pennfudanpen_code.py Detects people in random images from the PennFudanPed/ dataset, using the pretrained maskrcnn_resnet50_fpn model (from torchvision.models.detection) with trained weights loaded from a .pth file.
main_evaluate_people_code.py Detects people in test images, using the pretrained maskrcnn_resnet50_fpn model (from torchvision.models.detection) with trained weights loaded from a .pth file.
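
Saving weights to a .pth file and loading them back usually goes through the model's state_dict. The sketch below uses a tiny nn.Linear as a stand-in so it runs quickly; the file name is hypothetical. With Mask R-CNN the same calls should apply, provided the model is built with the same heads and number of classes that were used for training.

```python
import os
import tempfile

import torch
from torch import nn

model = nn.Linear(4, 2)  # stand-in for the trained detector

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weights.pth")  # hypothetical file name
    torch.save(model.state_dict(), path)     # what a training script would write

    restored = nn.Linear(4, 2)               # must match the saved architecture
    # map_location="cpu" lets weights trained on a GPU load on a CPU-only machine.
    state = torch.load(path, map_location="cpu")
    restored.load_state_dict(state)

restored.eval()  # switch to inference mode before evaluating
```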

Webcam examples with RGB cameras


File Description
webcam_basic_loop_01.py Basic loop that extracts frames from a webcam, without object detection.
webcam_obj_detect_01.py A simple object detector; its performance is insufficient.
webcam_obj_detect_02.py Demo of BBOX object detection. It reads a stream from a webcam and detects objects.
webcam_obj_detect_pre_bbox.py Demo of BBOX object detection with the default pretrained Mask R-CNN model.
webcam_obj_detect_pre_mask.py Demo of MASK object detection with the default pretrained Mask R-CNN model.
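
A basic webcam loop can be sketched as below. The function names read_frames and main are hypothetical, camera index 0 and the "q" key to quit are assumptions, and the scripts above may be organised differently. The frame reader only relies on an object with a read() method, as cv2.VideoCapture provides.

```python
def read_frames(capture, max_frames=None):
    """Yield frames from a capture object until reading fails or max_frames is hit."""
    count = 0
    while max_frames is None or count < max_frames:
        ok, frame = capture.read()  # cv2.VideoCapture.read() returns (success, frame)
        if not ok:
            break
        yield frame
        count += 1


def main(camera_index=0):
    """Show the webcam stream in a window; press 'q' to quit."""
    import cv2  # imported here so read_frames() can be used without OpenCV

    capture = cv2.VideoCapture(camera_index)
    try:
        for frame in read_frames(capture):
            # An object detector could process `frame` here before display.
            cv2.imshow("webcam", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

Calling main() starts the loop; a detector would be applied to each frame at the commented line.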


Hardware and software stack used

  • Ubuntu 20.04.3 LTS, 64-bit.
  • Windows 10
  • Intel® Core™ i7-8750H CPU @ 2.20GHz × 12.
  • GeForce GTX 1050 Ti Mobile.
  • Python 3.8.10

Editing tools

Python stack environment

Create the environment

# venv is part of the Python 3 standard library; on Ubuntu it is provided by the python3.8-venv package.
python3 -m venv ./object_detector_tutorial_venv
source ./object_detector_tutorial_venv/bin/activate
python --version
pip install --upgrade pip

Installing libraries

pip install -r requirements_windows.txt

Installing on Windows 10

pip install opencv-python

Installing on Ubuntu 20.04 LTS

Install Python tools

sudo apt install python3-pip
sudo apt install python3.8-venv

Installing the CUDA toolkit: Linux notes

Removing any existing NVIDIA packages

sudo rm /etc/apt/sources.list.d/cuda*
sudo apt remove --autoremove nvidia-cuda-toolkit
sudo apt remove --autoremove 'nvidia-*'
sudo rm -rf /usr/local/cuda*
sudo apt-get purge 'nvidia*'
sudo apt-get update
sudo apt-get autoremove
sudo apt-get autoclean

Install nvidia-cuda-toolkit

Download the current toolkit from the NVIDIA website.

Installing driver

sudo apt-get update
sudo ubuntu-drivers autoinstall

Checking the installed CUDA version

nvcc --version
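
nvcc reports the version of the installed toolkit, which can differ from the CUDA version PyTorch was built with. A short check from Python, assuming PyTorch is already installed in the environment, might look like this:

```python
import torch

cuda_ok = torch.cuda.is_available()
device = torch.device("cuda" if cuda_ok else "cpu")

print("PyTorch version:", torch.__version__)
print("CUDA available:", cuda_ok)
# torch.version.cuda is None on CPU-only builds of PyTorch.
print("CUDA version PyTorch was built with:", torch.version.cuda)
if cuda_ok:
    print("GPU:", torch.cuda.get_device_name(0))
```

The device object can then be passed to model.to(device) and tensor.to(device) so the same script runs on machines with or without a GPU.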