Mask_RCNN-Vizzy_Hand

The purpose of this repository is to use Mask R-CNN for 2D segmentation of a humanoid robotic hand.

For the network architecture, we use the Matterport implementation of Mask R-CNN. This library can be found in my_mrcnn. We include the library in our repository because we made some minor changes to its code.

The hand folder contains our main code, which handles the datasets, extends some functions of my_mrcnn, configures the hyperparameters, and runs the training and inference processes.

The utils folder provides utility functions, such as evaluation metrics, pre-processing functions, and an annotation tool to generate ground-truth masks from real images, among other utilities.
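As an illustration of the kind of evaluation metric found in utils, the sketch below computes the intersection-over-union (IoU) between a predicted and a ground-truth binary mask. It is a minimal example: the function name mask_iou and its signature are our own and may differ from the repository's actual code.

```python
import numpy as np

def mask_iou(pred_mask, gt_mask):
    """IoU between two binary masks given as H x W boolean (or 0/1) arrays."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    # An empty union means both masks are empty; define IoU as 0 in that case.
    return intersection / union if union > 0 else 0.0
```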

To produce training and validation data, we also provide a Unity framework that generates simulated images, available in Unity_package.

Training

  • Prepare the train/val data: place the RGB images in a folder called "images" and the ground-truth binary masks in a folder called "mask". Both folders must be in the same directory. The masks are RGB images in which positive pixels have an RGB value of (0, 0, 0) and negative pixels have an RGB value of (255, 255, 255); see the sanity-check sketch after this list.

  • Download the Mask R-CNN COCO pre-trained weights.

  • To train the network on the new task, run the following terminal command (inside the "hand" folder): python3 hand.py -m=train -d=/path/to/logs -i=/path/to/train_val_images_masks -w=/path/to/pre-trained_weights
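To make the expected data layout concrete, here is a small sanity-check script for the convention described in the first bullet. It is not part of the repository: the folder names "images" and "mask" come from the instructions above, while the path and function name are illustrative.

```python
import os
import numpy as np
from PIL import Image

DATA_DIR = "/path/to/train_val_images_masks"  # same path passed with -i

def check_dataset(data_dir):
    """Verify the expected layout: data_dir/images and data_dir/mask side by side."""
    img_dir = os.path.join(data_dir, "images")
    mask_dir = os.path.join(data_dir, "mask")
    assert os.path.isdir(img_dir) and os.path.isdir(mask_dir), "missing images/ or mask/"
    for name in sorted(os.listdir(mask_dir)):
        mask = np.array(Image.open(os.path.join(mask_dir, name)).convert("RGB"))
        # Positive (hand) pixels are black (0, 0, 0); negative pixels are white (255, 255, 255).
        positive = np.all(mask == 0, axis=-1)
        print(f"{name}: {positive.sum()} positive pixels")

check_dataset(DATA_DIR)
```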

Inference

  • To evaluate a model on the validation set used to configure the hyperparameters, run the terminal command: python3 hand.py -m=test -d=/ -i=/path/to/train_val_images_masks -w=/ -p=/path/to/model

  • To evaluate on a new test dataset, unseen by the model, run the terminal command: python3 hand.py -m=test -d=/ -i=/ -w=/ -t -p=/path/to/model
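For readers who want to run a trained model directly from Python, the sketch below follows Matterport's standard inference API, which my_mrcnn is based on. It is an assumption-laden illustration: the import paths are upstream Matterport's and may differ in the my_mrcnn fork, and the HandInferenceConfig values are placeholders (the real hyperparameters live in hand/hand.py).

```python
import skimage.io
import mrcnn.model as modellib
from mrcnn.config import Config

class HandInferenceConfig(Config):
    # Illustrative values; the actual configuration is defined in hand/hand.py.
    NAME = "hand"
    NUM_CLASSES = 1 + 1          # background + hand
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

config = HandInferenceConfig()
model = modellib.MaskRCNN(mode="inference", config=config, model_dir="/path/to/logs")
model.load_weights("/path/to/model", by_name=True)

image = skimage.io.imread("/path/to/test_image.png")
result = model.detect([image], verbose=0)[0]
# result["masks"] has shape (H, W, N): one binary mask per detected instance.
print(result["rois"], result["scores"])
```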

Results of our final model on test images

[Figures: segmentation results of our final model on test images 1 to 4]