Instance Segmentation using Mask R-CNN on a Custom Dataset

Authors

Arpit Aggarwal, Shantam Bajpai

Brief Introduction to the Project

This project addresses the task of instance segmentation, which combines object detection and semantic segmentation to produce a per-pixel mask for each detected object. We use a pre-trained Mask R-CNN model, which is fine-tuned on our dataset.

Software Required

To run the Jupyter notebooks and Python files, use Python 3. The code relies on common libraries such as NumPy and PyTorch.

Steps to Download the dataset

pip install cython
pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
wget https://www.cis.upenn.edu/~jshi/ped_html/PennFudanPed.zip
unzip PennFudanPed.zip

Dataset Description

The dataset used in this project is the Penn-Fudan pedestrian dataset (PennFudanPed). There are 2 classes: pedestrian and background.
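To illustrate how per-instance targets can be derived from this dataset, here is a minimal sketch. It assumes the usual PennFudanPed mask layout, where each mask image stores 0 for background and a distinct positive integer for every pedestrian. The function name masks_and_boxes is hypothetical and is not taken from the project code.

```python
import numpy as np


def masks_and_boxes(label_mask):
    """Split an instance-label mask into binary masks and bounding boxes.

    label_mask: 2-D integer array, 0 = background, k > 0 = pedestrian k
    (assumed layout). Returns (binary_masks, boxes), where binary_masks has
    shape (N, H, W) and boxes has shape (N, 4) as [xmin, ymin, xmax, ymax].
    """
    label_mask = np.asarray(label_mask)
    ids = np.unique(label_mask)
    ids = ids[ids != 0]  # drop the background id

    # One boolean mask per instance id, via broadcasting.
    binary_masks = label_mask[None, :, :] == ids[:, None, None]

    boxes = []
    for mask in binary_masks:
        ys, xs = np.nonzero(mask)
        boxes.append([xs.min(), ys.min(), xs.max(), ys.max()])

    # reshape keeps the (0, 4) shape when an image has no instances
    return binary_masks, np.array(boxes, dtype=np.int64).reshape(-1, 4)
```

In practice the mask would first be read from a PNG file (for example with PIL) and converted to a NumPy array before being passed to this function.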

Steps for running the code

Training is done with the file "Code/train.py". The model is pre-trained on the COCO dataset and uses a ResNet-50 FPN as the backbone of Mask R-CNN. The backbone outputs feature maps that are fed into the Region Proposal Network, which finds the image regions likely to contain objects. Mask R-CNN is built on top of Faster R-CNN: in addition to bounding boxes and class labels, it also predicts a segmentation mask for each instance.
To run the train.py file, use the following command (make sure a GPU is available):

python Code/train.py <dataset_path>

For example, the command used to run the training script was:

python Code/train.py /home/arpitdec5/Desktop/PennFudanPed

To run the test.py file, use the command below:

python Code/test.py <dataset_path> <model_path>
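At test time, a torchvision Mask R-CNN returns, for each image, confidence scores and soft masks of shape (N, 1, H, W) with values between 0 and 1. The sketch below shows one possible way to turn these into final binary instance masks. The function name filter_predictions and both thresholds of 0.5 are assumptions, not values taken from Code/test.py.

```python
import numpy as np


def filter_predictions(scores, masks, score_threshold=0.5, mask_threshold=0.5):
    """Keep confident detections and binarise their soft masks.

    scores: shape (N,), detection confidences.
    masks:  shape (N, 1, H, W), per-pixel mask probabilities.
    Returns (kept_scores, binary_masks), binary_masks of shape (K, H, W).
    """
    scores = np.asarray(scores)
    masks = np.asarray(masks)

    keep = scores >= score_threshold          # drop low-confidence detections
    soft = masks[keep][:, 0]                  # remove the channel axis
    binary_masks = soft >= mask_threshold     # per-pixel decision
    return scores[keep], binary_masks
```

When the model outputs are PyTorch tensors, they would be moved to the CPU and converted with .numpy() before calling this function.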

Credits

The following links helped in completing the project:

bshantam97/mask_rcnn_instance_segmentation