MMTOD

Multi-modal Thermal Object Detector

Primary language: Python. License: MIT.


Borrow from Anywhere: Pseudo Multi-Modal Object Detection in Thermal Imagery

[Paper Link]

Framework

Preparation:

Clone the repository:

git clone https://github.com/tdchaitanya/MMTOD.git
cd MMTOD

Create a data folder inside the repository:

mkdir data

Prerequisites

  • Python 3.6
  • PyTorch 1.0
  • CUDA 8.0

Data Preparation

  • FLIR ADAS: Dataset can be downloaded from here.

  • KAIST: Dataset can be downloaded from here.

For ease of training, we convert all the annotations into PASCAL-VOC format. To convert the FLIR and KAIST dataset annotations into PASCAL-VOC format, use the scripts in the generate_annotations folder.
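As a rough illustration of the target format, the sketch below builds one PASCAL-VOC style XML annotation with the Python standard library. It is not the code in generate_annotations; the function name build_voc_annotation, the sample file name, the box values, and the 3-channel depth are all hypothetical.

```python
import xml.etree.ElementTree as ET


def build_voc_annotation(filename, width, height, objects, depth=3):
    """Build a PASCAL-VOC style annotation tree.

    objects: iterable of (class_name, xmin, ymin, xmax, ymax) in pixels.
    depth: number of image channels (3 is an assumption, not from the README).
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "folder").text = "VOC2007"
    ET.SubElement(root, "filename").text = filename

    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = str(depth)

    for name, xmin, ymin, xmax, ymax in objects:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        ET.SubElement(obj, "difficult").text = "0"
        box = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"),
                              (xmin, ymin, xmax, ymax)):
            ET.SubElement(box, tag).text = str(int(value))
    return ET.ElementTree(root)


if __name__ == "__main__":
    # Hypothetical image with a single "person" box.
    tree = build_voc_annotation("example_00001.jpeg", 640, 512,
                                [("person", 10, 20, 110, 220)])
    print(ET.tostring(tree.getroot(), encoding="unicode"))
```

Calling tree.write("Annotations/<image id>.xml") on the returned object would save the annotation next to the others in the VOC layout.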

After converting the annotations, images and annotation files should be arranged in VOC format. The data directory should be organized as follows.

data
├── coco
├── pretrained_model --> resnet101_caffe.pth
└── VOCdevkit2007
    └── VOC2007
        ├── Annotations
        ├── ImageSets
        │   └── Main --> trainval.txt, test.txt
        └── JPEGImages
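The empty skeleton of this layout can be created in one step. The following is a minimal sketch using pathlib; the helper name make_data_layout is hypothetical, and it only creates directories, it does not download or move any data.

```python
from pathlib import Path


def make_data_layout(root="data"):
    """Create the empty directory skeleton shown above and return the paths."""
    root = Path(root)
    voc = root / "VOCdevkit2007" / "VOC2007"
    dirs = [
        root / "coco",
        root / "pretrained_model",
        voc / "Annotations",
        voc / "ImageSets" / "Main",
        voc / "JPEGImages",
    ]
    for d in dirs:
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        d.mkdir(parents=True, exist_ok=True)
    return dirs

# Example (run from the repository root): make_data_layout("data")
```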

The trainval.txt and test.txt files for the FLIR, FLIR-(1/2), FLIR-(1/4) and KAIST datasets are provided in the Google Drive folder linked below. test.txt is the same for all the FLIR datasets.
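Before training, it can help to confirm that every image ID in a split file has both an annotation and an image. The sketch below assumes one image ID per line in the split file and accepts any image extension, since neither detail is stated here; find_missing is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def find_missing(voc_root, split="trainval"):
    """Return IDs listed in ImageSets/Main/<split>.txt that lack an annotation or image."""
    voc_root = Path(voc_root)
    split_file = voc_root / "ImageSets" / "Main" / "{}.txt".format(split)
    ids = split_file.read_text().split()
    missing = []
    for image_id in ids:
        has_xml = (voc_root / "Annotations" / (image_id + ".xml")).is_file()
        # Any extension is accepted because the image format is an assumption.
        has_img = any((voc_root / "JPEGImages").glob(image_id + ".*"))
        if not (has_xml and has_img):
            missing.append(image_id)
    return missing

# Example: find_missing("data/VOCdevkit2007/VOC2007", split="test")
```

An empty list means the split file, Annotations and JPEGImages are consistent with each other.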

Pretrained Model:

We use a pre-trained ResNet-101 model as the Faster-RCNN base. You can download the weights from:

Download them and put them into the data/pretrained_model/ directory.

The Thermal-to-RGB translation networks and MMTOD take a considerable amount of time to train, so to make the results easy to reproduce we are sharing all the weight files. The shared folder is fairly large, so it's recommended to download only the files corresponding to the model you wish to reproduce.

Compilation

Install all the python dependencies using pip:

pip install -r requirements.txt

Compile the CUDA dependencies using the following commands:

cd lib
python setup.py build develop

As pointed out in this issue, if you encounter an error during compilation, you may have forgotten to export the CUDA paths to your environment.
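A quick way to see whether the CUDA paths are visible to the build is to inspect the environment from Python. The sketch below is only a heuristic: the variable names CUDA_HOME, PATH and LD_LIBRARY_PATH and the default location /usr/local/cuda are the ones commonly used on Linux and are assumptions here, since the linked issue is not reproduced in this document.

```python
import os


def cuda_env_report(env=None, cuda_home_default="/usr/local/cuda"):
    """Report whether common CUDA-related variables point at a CUDA install."""
    env = os.environ if env is None else env
    # Fall back to a conventional install location when CUDA_HOME is unset.
    cuda_home = env.get("CUDA_HOME", cuda_home_default)
    path_entries = env.get("PATH", "").split(os.pathsep)
    lib_entries = env.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    return {
        "CUDA_HOME set": "CUDA_HOME" in env,
        "bin on PATH": os.path.join(cuda_home, "bin") in path_entries,
        "lib64 on LD_LIBRARY_PATH": os.path.join(cuda_home, "lib64") in lib_entries,
    }


if __name__ == "__main__":
    for check, ok in cuda_env_report().items():
        print("{}: {}".format(check, "ok" if ok else "missing"))
```

Any entry reported as missing is a candidate for an export line in your shell profile before rerunning the compilation step.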

Train

Pick the trainval.txt for the FLIR dataset along with the test.txt and place them in the ./data/VOCdevkit2007/VOC2007/ImageSets/Main/ folder. trainval.txt is different for each of FLIR, FLIR-1/2, and FLIR-1/4.

1). Single mode Faster-RCNN on FLIR ADAS:

Once you complete all the above steps, training the baseline model should be simple. Execute the following command to start training:

python trainval_net.py --dataset pascal_voc --net res101_thermal --bs 8 --nw 4 --epochs 15 --cuda --use_tfb 

2). MMTOD-UNIT

To use the RGB branch of the network, you'll need to generate a pseudo-RGB image from the input thermal image. This requires pre-trained Thermal-to-RGB UNIT weights, which can be downloaded from the unit/models folder in the drive. Place these weights in lib/model/unit/models.

Along with rgb2thermal.pt, you'll also need the VGG16 weights. Download them from this link and place them in the same folder as rgb2thermal.pt.

Since we initialize the RGB and Thermal branches with pre-trained weights, you'll need the pre-trained weights for both the branches.

Pre-trained weights for the RGB branch can be found in MS-COCO/res101_coco and PASCAL-VOC/res101_pascal, and pre-trained weights for the thermal branch can be found in FLIR/res101_thermal, all in this drive folder. Place the pre-trained thermal weights in models/res101_thermal.

  • MS-COCO as RGB branch:

Download the pre-trained MS-COCO weights from MS-COCO/res101_coco and place them in the models/res101_coco folder. Start the training by running the following command:

python trainval_unit_update_coco.py --dataset pascal_voc --net res101_unit_update_coco --bs 1 --nw 4 --epochs 15 --cuda

  • PASCAL-VOC as RGB branch:

Download the pre-trained PASCAL-VOC weights from PASCAL-VOC/res101_pascal and place them in the models/res101_pascal folder. Start the training by running the following command:

python trainval_unit_update.py --dataset pascal_voc --net res101_unit_update --bs 1 --nw 4 --epochs 15 --cuda --use_tfb
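Because MMTOD-UNIT training depends on several separately downloaded weight folders, a pre-flight check can save a failed run. The sketch below covers the PASCAL-VOC variant and only tests that each directory named above exists and contains at least one file; it does not know or verify individual weight file names. The helper missing_weight_dirs is hypothetical.

```python
from pathlib import Path

# Directories named in the instructions above (PASCAL-VOC as RGB branch).
REQUIRED_DIRS = [
    "data/pretrained_model",
    "lib/model/unit/models",
    "models/res101_thermal",
    "models/res101_pascal",
]


def missing_weight_dirs(repo_root=".", required=REQUIRED_DIRS):
    """Return the required directories that are absent or contain no files."""
    repo_root = Path(repo_root)
    missing = []
    for rel in required:
        d = repo_root / rel
        # rglob also finds files nested in subfolders of the weight directory.
        if not d.is_dir() or not any(p.is_file() for p in d.rglob("*")):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    for rel in missing_weight_dirs():
        print("missing or empty:", rel)
```

For the MS-COCO variant, swapping models/res101_pascal for models/res101_coco in the list gives the corresponding check.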

3). MMTOD-CycleGAN

For Thermal-to-RGB translation you need the RGB-to-Thermal CycleGAN weights, which can be downloaded from the cgan/checkpoints/rg2thermal_flir folder in the drive. Place these weights in lib/model/cgan/checkpoints.

  • MS-COCO as RGB branch:

Download the pre-trained MS-COCO weights from MS-COCO/res101_coco and place them in the models/res101_coco folder. Start the training by running the following command:

python trainval_cgan_update_coco.py --dataset pascal_voc --net res101_cgan_update_coco --bs 4 --nw 4 --epochs 15 --cuda --name rgb2thermal_flir --use_tfb

  • PASCAL-VOC as RGB branch:

Download the pre-trained PASCAL-VOC weights from PASCAL-VOC/res101_pascal and place them in the models/res101_pascal folder. Start the training by running the following command:

python trainval_cgan_update.py --dataset pascal_voc --net res101_cgan_update --bs 4 --nw 4 --epochs 15 --cuda --name rgb2thermal_flir --use_tfb

Training on FLIR-1/2 and FLIR-1/4

To train on FLIR-1/2 or FLIR-1/4, change the trainval.txt file in ./data/VOCdevkit2007/VOC2007/ImageSets/Main/, replacing it with the corresponding file from the FLIR-1/2 or FLIR-1/4 folder shared in the drive. Then follow the same procedure and commands listed above.
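Swapping the split file by hand makes it easy to lose track of which version is in place. The sketch below replaces trainval.txt while keeping a backup of the previous file; swap_trainval and the .bak suffix are hypothetical choices, and the source path in the usage comment is a placeholder for wherever you saved the downloaded file.

```python
import shutil
from pathlib import Path

MAIN_DIR = "data/VOCdevkit2007/VOC2007/ImageSets/Main"


def swap_trainval(new_trainval, main_dir=MAIN_DIR):
    """Replace trainval.txt with new_trainval, keeping a backup of the old file."""
    main_dir = Path(main_dir)
    target = main_dir / "trainval.txt"
    if target.exists():
        # Keep the previous split so the full-FLIR file can be restored later.
        shutil.copy2(str(target), str(main_dir / "trainval.txt.bak"))
    shutil.copy2(str(new_trainval), str(target))
    return target

# Example (placeholder path): swap_trainval("downloads/FLIR-1_2/trainval.txt")
```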

Testing and reproducing the results in the paper

Reproducing results on FLIR dataset:

Baseline: Weights for the baseline are located in the FLIR/res101_thermal folder in the drive. Place the folder as is in the models directory and run the following command:

python test_net.py --dataset pascal_voc --net res101_thermal --checksession 1 --checkepoch 15 --checkpoint 1963 --cuda

1). MMTOD-UNIT

  • MS-COCO as RGB Branch

Weights are located in the FLIR/res101_unit_update_coco folder.

Follow the instructions in the train section to use the Thermal-to-RGB weights. Execute the following command to see the results:

python test_net_unit_update.py --dataset pascal_voc --net res101_unit_update_coco --checksession 1 --checkepoch 15 --checkpoint 15717 --cuda

  • PASCAL-VOC as RGB Branch

Weights are located in the FLIR/res101_unit_update folder.

As mentioned for MS-COCO above, make sure to download the Thermal-to-RGB weight files and place them in the appropriate directory. Execute the following command to see the results:

python test_net_unit_update.py --dataset pascal_voc --net res101_unit_update --checksession 1 --checkepoch 15 --checkpoint 15717 --cuda 

2). MMTOD-CycleGAN

  • MS-COCO as the RGB Branch:

Weights are located in the FLIR/res101_cgan_update_coco folder.

Follow the instructions in the train section to use the Thermal-to-RGB weights. Execute the following command to see the results:

python test_net_cgan_update.py --dataset pascal_voc --net res101_cgan_update_coco --checksession 1 --checkepoch 15 --checkpoint 3928 --cuda --name rgb2thermal_flir

  • PASCAL-VOC as RGB Branch

Weights are located in the FLIR/res101_cgan_update folder.

As mentioned for MS-COCO above, make sure to download the Thermal-to-RGB weight files and place them in the appropriate directory. Execute the following command to see the results:

python test_net_cgan_update.py --dataset pascal_voc --net res101_cgan_update --checksession 1 --checkepoch 15 --checkpoint 3928 --cuda --name rgb2thermal_flir

To reproduce the results on the FLIR-1/2 and FLIR-1/4 datasets, download the weights from the FLIR-1/2 and FLIR-1/4 directories in the shared drive folder.
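The five FLIR test commands listed above differ only in the script, the --net value, the --checkpoint value and the optional --name flag, so they can be collected in one table. The sketch below copies those values from the commands in this section; the dictionary keys and the helper build_test_command are hypothetical names. The checkpoint numbers apply to the full FLIR weights, and the values for FLIR-1/2 and FLIR-1/4 are not listed here, so they are left out.

```python
# (script, --net value, --checkpoint value, extra flags), copied from the commands above.
TEST_CONFIGS = {
    "baseline": ("test_net.py", "res101_thermal", 1963, []),
    "unit_coco": ("test_net_unit_update.py", "res101_unit_update_coco", 15717, []),
    "unit_pascal": ("test_net_unit_update.py", "res101_unit_update", 15717, []),
    "cgan_coco": ("test_net_cgan_update.py", "res101_cgan_update_coco", 3928,
                  ["--name", "rgb2thermal_flir"]),
    "cgan_pascal": ("test_net_cgan_update.py", "res101_cgan_update", 3928,
                    ["--name", "rgb2thermal_flir"]),
}


def build_test_command(model, session=1, epoch=15):
    """Return the argument list for one of the test commands listed above."""
    script, net, checkpoint, extra = TEST_CONFIGS[model]
    return [
        "python", script,
        "--dataset", "pascal_voc",
        "--net", net,
        "--checksession", str(session),
        "--checkepoch", str(epoch),
        "--checkpoint", str(checkpoint),
        "--cuda",
    ] + extra


if __name__ == "__main__":
    # Print every command; pass a list to subprocess.run(...) to execute it.
    for name in TEST_CONFIGS:
        print(" ".join(build_test_command(name)))
```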

Citation

 @InProceedings{Devaguptapu_2019_CVPR_Workshops,
    author = {Devaguptapu, Chaitanya and Akolekar, Ninad and M Sharma, Manuj and N Balasubramanian, Vineeth},
    title = {Borrow From Anywhere: Pseudo Multi-Modal Object Detection in Thermal Imagery},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month = {June},
    year = {2019}}

Acknowledgement

This repository is heavily inspired by, and modified from, jwyang/faster-rcnn.pytorch. We make use of junyanz/pytorch-CycleGAN-and-pix2pix and mingyuliutw/UNIT for training the CycleGAN and UNIT models for Thermal-to-RGB translation.