/keras-retinanet

Keras implementation of RetinaNet object detection.

Primary language: Python. License: Apache License 2.0.

This version of RetinaNet is inspired by fizyr's implementation on GitHub and has been modified to train on polarimetric images.

The installation steps are the same as for fizyr's repository, and you can refer to their README for training and testing on other datasets.

Installation

  1. Clone this repository.
  2. In the repository, execute pip install . --user. Note that due to inconsistencies in how tensorflow should be installed, this package does not declare a dependency on tensorflow, because pip would then try to install it (which, at least on Arch Linux, results in an incorrect installation). Please make sure tensorflow is installed as per your system's requirements.
  3. Alternatively, you can run the code directly from the cloned repository, but you first need to run python setup.py build_ext --inplace to compile the Cython code.
  4. Optionally, install pycocotools if you want to train / test on the MS COCO dataset by running pip install --user git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI.

Testing

An example of testing the network can be seen in this Notebook or with this code. In general, inference of the network works as follows:

boxes, scores, labels = model.predict_on_batch(inputs)

Here boxes is shaped (None, None, 4) (for (x1, y1, x2, y2)), scores is shaped (None, None) (classification score) and labels is shaped (None, None) (label corresponding to the score). In all three outputs, the first dimension indexes the images in the batch and the second dimension indexes the list of detections.
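As a rough illustration of how these outputs might be consumed, the sketch below keeps the detections of the first image whose score reaches a threshold. The helper name keep_confident_detections and the 0.5 threshold are hypothetical choices, not part of keras-retinanet; the code only assumes the output layout described above and works on NumPy arrays and nested lists alike.

```python
def keep_confident_detections(boxes, scores, labels, score_threshold=0.5):
    """Return (box, score, label) tuples for the first image in the batch.

    Hypothetical helper: assumes boxes[image][detection] = (x1, y1, x2, y2)
    and scores/labels[image][detection] = scalar, as described above.
    """
    kept = []
    for box, score, label in zip(boxes[0], scores[0], labels[0]):
        if score < score_threshold:
            continue  # skip low-confidence detections
        kept.append(([float(v) for v in box], float(score), int(label)))
    return kept


# Stand-in for the result of model.predict_on_batch(inputs) on one image.
boxes = [[[10, 20, 110, 220], [5, 5, 50, 50], [0, 0, 8, 8]]]
scores = [[0.92, 0.31, 0.07]]
labels = [[0, 2, 1]]

detections = keep_confident_detections(boxes, scores, labels)
```

With real model outputs, the same call would be made on the three arrays returned by predict_on_batch.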

Loading models can be done in the following manner:

from keras_retinanet.models import load_model
model = load_model('/path/to/model.h5', backbone_name='resnet50')

Converting a training model to inference model

The training procedure of keras-retinanet works with training models. These are stripped-down versions of the inference model and contain only the layers necessary for training (regression and classification values). If you wish to do inference on a model (perform object detection on an image), you need to convert the trained model to an inference model. This is done as follows:

# Running directly from the repository:
keras_retinanet/bin/convert_model.py /path/to/training/model.h5 /path/to/save/inference/model.h5

# Using the installed script:
retinanet-convert-model /path/to/training/model.h5 /path/to/save/inference/model.h5

Most scripts (like retinanet-evaluate) also support converting on the fly, using the --convert-model argument.

Training

keras-retinanet can be trained using this script. Note that the train script uses relative imports since it is inside the keras_retinanet package. If you want to adjust the script for your own use outside of this repository, you will need to switch it to use absolute imports.

If you installed keras-retinanet correctly, the train script will be installed as retinanet-train. However, if you make local modifications to the keras-retinanet repository, you should run the script directly from the repository. That will ensure that your local changes will be used by the train script.

The default backbone is resnet50. You can change this using the --backbone=xxx argument in the running script. xxx can be one of the backbones in resnet models (resnet50, resnet101, resnet152), mobilenet models (mobilenet128_1.0, mobilenet128_0.75, mobilenet160_1.0, etc), densenet models or vgg models. The different options are defined by each model in their corresponding python scripts (resnet.py, mobilenet.py, etc).

Trained models can't be used directly for inference. To convert a trained model to an inference model, see "Converting a training model to inference model" above.

Usage

The pretrained MS COCO model can be downloaded here. Results using the cocoapi are shown below (note: according to the paper, this configuration should achieve a mAP of 0.357).

No fusion

To train on Polar dataset

# Running directly from the repository:
python keras_retinanet/bin/train.py --epochs number_of_epoch --batch-size batch_size --steps number_of_steps_per_epoch --weights /path/to/weights/for/fine/tuning --snapshot-path /path/to/save/snapshots pascal /path/to/dataset/main/folder/ /relative/path/to/the/train/images /relative/path/to/the/train/labels /relative/path/to/the/val/images /relative/path/to/the/val/labels

To evaluate on Polar dataset

# Running directly from the repository:
python keras_retinanet/bin/evaluate.py pascal /path/to/dataset/main/folder/ /relative/path/to/the/test/folder/from/dataset/repository /relative/path/to/the/test/labels/folder/from/dataset/repository /path/to/weights  (--convert-model if needed)

The pretrained Polar models can be downloaded here.

Fusion

This implementation supports early and late fusion of polarimetric and color data.

Early fusion

To train the network with early fusion between two three-channel images:

# Running directly from the repository:
python keras_retinanet/bin/train.py --epochs number_of_epoch --batch-size batch_size --steps number_of_steps_per_epoch --backbone fusion_backbone --weights /path/to/weights/for/fine/tuning --snapshot-path /path/to/save/snapshots pascal-early-fusion /path/to/dataset/main/folder/ /relative/path/to/train/modality1 /relative/path/to/train/modality2 /relative/path/to/train/labels /relative/path/to/val/modality1 /relative/path/to/val/modality2 /relative/path/to/val/labels

Note that to use this early fusion scheme, i.e. processing a seven-channel image, you must use one of the following backbones: resnet50-multi, resnet101-multi or resnet152-multi.
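The backbone constraint can be checked before launching a run. The following is a minimal sketch of a wrapper that assembles the early-fusion training command and rejects any other backbone; the helper name early_fusion_train_args and all example paths are hypothetical, while the flags and argument order are copied from the command above.

```python
EARLY_FUSION_BACKBONES = ("resnet50-multi", "resnet101-multi", "resnet152-multi")


def early_fusion_train_args(backbone, dataset_root, train_dirs, val_dirs,
                            epochs, batch_size, steps, weights, snapshot_path):
    """Build the argument list for an early-fusion training run.

    train_dirs and val_dirs are (modality1, modality2, labels) paths,
    relative to dataset_root, in the order used by the command above.
    """
    if backbone not in EARLY_FUSION_BACKBONES:
        raise ValueError(
            "early fusion needs one of %s, got %r"
            % (", ".join(EARLY_FUSION_BACKBONES), backbone)
        )
    return [
        "python", "keras_retinanet/bin/train.py",
        "--epochs", str(epochs),
        "--batch-size", str(batch_size),
        "--steps", str(steps),
        "--backbone", backbone,
        "--weights", weights,
        "--snapshot-path", snapshot_path,
        "pascal-early-fusion", dataset_root,
        *train_dirs,
        *val_dirs,
    ]


# Hypothetical dataset layout, for illustration only.
args = early_fusion_train_args(
    "resnet50-multi", "/data/polar",
    ("train/polar", "train/rgb", "train/labels"),
    ("val/polar", "val/rgb", "val/labels"),
    epochs=50, batch_size=1, steps=1000,
    weights="weights.h5", snapshot_path="snapshots",
)
```

The resulting list can be handed to subprocess.run(args, check=True) from the repository root.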

To evaluate the network with early fusion between two three-channel images:

# Running directly from the repository:
python keras_retinanet/bin/evaluate.py pascal-early-fusion /path/to/dataset/main/folder/ /relative/path/to/test/modality1 /relative/path/to/test/modality2 /relative/path/to/test/labels /path/to/weights (--convert-model if needed)

Late fusion

For the late fusion scheme, two models trained on three-channel images are used, and their detections are combined according to a chosen filter.

Before evaluating the models, the two RetinaNet networks must have different layer names to avoid conflicts when loading the weights. The script to rename the weights of RetinaNet50, RetinaNet101 and RetinaNet152 can be found here.

Evaluating with desired filter:

# Running directly from the repository:
python keras_retinanet/bin/evaluate.py pascal-late-fusion /path/to/dataset/main/folder/ /relative/path/to/test/modality1 /relative/path/to/test/modality2 /relative/path/to/test/labels /path/to/first/model/weights --model2=/path/to/second/model/weights (or --model-multimodal if non-stackable polar and RGB images) --filter-style=desired_filter

Note that if the images are stackable pixelwise, use the option --model2 for the second model. If the bounding boxes predicted for the color modality (such as RGB) need to be registered to the polarimetric ones, use the option --model-multimodal for the second model instead, and pass the path to the color modality weights to that option.
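The choice between the two options can be expressed as a small wrapper. This is only a sketch: the helper name late_fusion_eval_args, the stackable parameter and the example paths are hypothetical, and it assumes --model-multimodal takes the weights path in the same --option=path form shown for --model2.

```python
def late_fusion_eval_args(dataset_root, modality1, modality2, labels,
                          first_weights, second_weights, filter_style,
                          stackable=True):
    """Build the argument list for a late-fusion evaluation run.

    stackable=True  -> images align pixelwise, second model via --model2
    stackable=False -> color boxes need registration, --model-multimodal
    """
    second_option = "--model2" if stackable else "--model-multimodal"
    return [
        "python", "keras_retinanet/bin/evaluate.py",
        "pascal-late-fusion", dataset_root,
        modality1, modality2, labels,
        first_weights,
        "%s=%s" % (second_option, second_weights),
        "--filter-style=%s" % filter_style,
    ]


# Hypothetical paths, for illustration only.
eval_args = late_fusion_eval_args(
    "/data/polar", "test/polar", "test/rgb", "test/labels",
    "polar_model.h5", "rgb_model.h5", "or-filter", stackable=False,
)
```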

The available filters (--filter-style options) are:

  1. Naive NMS filter: --filter-style=naive-fusion and set soft_nms_sigma to 0 (here, line 62)
  2. Naive soft-NMS filter: --filter-style=naive-fusion and set soft_nms_sigma to a value greater than 0 (here, line 62)
  3. Double soft-NMS filter: --filter-style=soft-nms and set soft_nms_sigma to a value greater than 0 (here, line 62)
  4. OR filter: --filter-style=or-filter
  5. AND filter: --filter-style=and-filter
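To make the role of soft_nms_sigma concrete, here is a small pure-Python sketch that pools the detections of two models and filters them with hard NMS (sigma equal to 0) or Gaussian soft-NMS (sigma greater than 0). It follows the generic soft-NMS formulation, in which an overlapping box has its score multiplied by exp(-IoU^2 / sigma), and is not taken from this repository: the function names, the thresholds and the pooling step are assumptions for illustration, and the actual filters may differ.

```python
import math


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_detections(dets_a, dets_b, sigma=0.0,
                    iou_threshold=0.5, score_threshold=0.05):
    """Pool (box, score) pairs from two models and filter them.

    sigma == 0 -> hard NMS: overlapping boxes are dropped.
    sigma > 0  -> Gaussian soft-NMS: overlapping boxes are down-weighted.
    """
    pool = [(tuple(box), float(score))
            for box, score in list(dets_a) + list(dets_b)]
    kept = []
    while pool:
        # Take the highest-scoring remaining detection.
        best_index = max(range(len(pool)), key=lambda i: pool[i][1])
        best_box, best_score = pool.pop(best_index)
        if best_score < score_threshold:
            break  # everything left scores even lower
        kept.append((best_box, best_score))
        rescored = []
        for box, score in pool:
            overlap = iou(best_box, box)
            if sigma > 0:
                score *= math.exp(-(overlap ** 2) / sigma)
            elif overlap > iou_threshold:
                continue  # hard suppression
            rescored.append((box, score))
        pool = rescored
    return kept


# Hypothetical detections from two models on the same image.
model_a = [((0, 0, 10, 10), 0.9)]
model_b = [((0, 0, 10, 10), 0.8), ((20, 20, 30, 30), 0.7)]

hard = fuse_detections(model_a, model_b, sigma=0.0)
soft = fuse_detections(model_a, model_b, sigma=0.5)
```

In this toy case the hard variant discards the duplicate box from the second model, while the soft variant keeps it with a strongly reduced score.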

Laplacian pyramids fusion

For this fusion scheme, the two modalities are fused as a pre-processing step, following the Laplacian pyramid fusion presented here.

To train using Laplacian pyramid fusion:

# Running directly from the repository:
python keras_retinanet/bin/train.py --epochs number_of_epoch --batch-size batch_size --steps number_of_steps_per_epoch --weights /path/to/weights/for/fine/tuning --snapshot-path /path/to/save/snapshots pascal-pyramid /path/to/dataset/main/folder/ /relative/path/to/train/modality1 /relative/path/to/train/modality2 /relative/path/to/train/labels /relative/path/to/val/modality1 /relative/path/to/val/modality2 /relative/path/to/val/labels

To evaluate using Laplacian pyramid fusion:

# Running directly from the repository:
python keras_retinanet/bin/evaluate.py pascal-pyramid /path/to/dataset/main/folder/ /relative/path/to/test/modality1 /relative/path/to/test/modality2 /relative/path/to/test/labels /path/to/model/weights (--convert-model if needed)

Results

Example output images using keras-retinanet are shown below.

On the right, detection results on (I0, I45, I135); in the center, detection results on (S0, S1, S2); on the left, detection results on (I0, AOP, DOP).
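For readers unfamiliar with these three encodings, the sketch below shows the standard per-pixel relations between polarizer intensities, linear Stokes parameters, and the angle and degree of polarization (AOP, DOP). The function names are hypothetical and the code is not taken from this repository, which may scale or normalize the channels differently before feeding them to the network.

```python
import math


def stokes_from_intensities(i0, i45, i135):
    """Linear Stokes parameters of one pixel from three polarizer angles.

    Uses I0 + I90 = I45 + I135 to recover the missing 90 degree intensity.
    """
    i90 = i45 + i135 - i0
    s0 = i0 + i90   # total intensity
    s1 = i0 - i90   # horizontal vs. vertical component
    s2 = i45 - i135  # diagonal component
    return s0, s1, s2


def aop_dop(s0, s1, s2):
    """Angle (radians) and degree of linear polarization."""
    aop = 0.5 * math.atan2(s2, s1)
    dop = math.sqrt(s1 ** 2 + s2 ** 2) / s0 if s0 > 0 else 0.0
    return aop, dop


# Fully horizontally polarized light of unit intensity.
s0, s1, s2 = stokes_from_intensities(1.0, 0.5, 0.5)
aop, dop = aop_dop(s0, s1, s2)
```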

Notes

  • This repository requires Keras 2.2.0 or higher.
  • This repository is tested using OpenCV 3.4.
  • This repository is tested using Python 3.6.