/FuseNet_PyTorch

Joint scene classification and semantic segmentation with FuseNet

FuseNet

Joint scene classification and semantic segmentation using the FuseNet architecture from "FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture". This repository tests whether an additional scene classification loss affects the overall semantic segmentation quality.
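As a rough illustration of what an additional scene classification loss could look like, the sketch below adds a classification cross-entropy term to a per-pixel segmentation cross-entropy term. The weighted-sum form, the function names, and the weight `lam` are assumptions made for this example and are not taken from Train_FuseNet.py. The 40 and 10 class counts mirror the NYU dataset description below.

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean cross-entropy for logits of shape (N, C) and integer targets of shape (N,)."""
    # Subtract the row-wise max for numerical stability before the log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

def joint_loss(seg_logits, seg_labels, cls_logits, cls_labels, lam=1.0):
    """Hypothetical joint objective: segmentation loss + lam * classification loss."""
    # seg_logits: (num_pixels, num_seg_classes), one row per pixel.
    # cls_logits: (batch_size, num_scene_classes), one row per image.
    seg_term = cross_entropy(seg_logits, seg_labels)
    cls_term = cross_entropy(cls_logits, cls_labels)
    return seg_term + lam * cls_term

# Toy inputs: 6 pixels over 40 segmentation classes, 2 images over 10 scene classes.
seg_logits = np.zeros((6, 40))
seg_labels = np.array([0, 1, 2, 3, 4, 5])
cls_logits = np.zeros((2, 10))
cls_labels = np.array([3, 7])

total = joint_loss(seg_logits, seg_labels, cls_logits, cls_labels, lam=0.5)
print(total)
```

Setting `lam=0` in this sketch recovers a segmentation-only loss, which gives a baseline for comparing runs with and without the classification term.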

Dependencies

Datasets

NYU

  1. Download the processed .h5py dataset with 40 annotations and 10 classes here: train + test set. (TODO: When running Train_FuseNet.py, download the dataset automatically if it is not found)

SUNRGBD

  1. TODO: Download the dataset here

  2. TODO: Download the class weight file here.

Training

  • To train FuseNet, run Train_FuseNet.py. The dataset choice is currently hard-coded in the script. The dataset is loaded and prepared by utils/data_utils_class.py, so make sure to set the correct path in the script. (TODO: Pass these as arguments instead of editing the script manually)

  • Note: VGG weights are downloaded automatically at the beginning of the training process. The weights of the depth layers are also initialized from their equivalent VGG16 layers. For 'conv1_1', however, the weights are averaged over the input channels to fit the one-channel depth input: (3, 3, 3, 64) -> (3, 3, 1, 64)
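The 'conv1_1' averaging described in the note can be sketched as follows. This is a minimal illustration, assuming the (3, 3, 3, 64) shape is laid out as (kernel height, kernel width, input channels, output channels). The function name is hypothetical, and random values stand in for the pretrained VGG16 weights.

```python
import numpy as np

def average_rgb_kernel(rgb_kernel):
    """Collapse a (kh, kw, 3, out) kernel to (kh, kw, 1, out) by averaging input channels."""
    # Axis 2 is the input-channel axis in the assumed (3, 3, 3, 64) layout.
    return rgb_kernel.mean(axis=2, keepdims=True)

# Random stand-in for the pretrained conv1_1 weights (not the real VGG16 values).
rng = np.random.default_rng(0)
conv1_1_rgb = rng.standard_normal((3, 3, 3, 64))

conv1_1_depth = average_rgb_kernel(conv1_1_rgb)
print(conv1_1_depth.shape)  # (3, 3, 1, 64)
```

PyTorch stores convolution weights as (out_channels, in_channels, kernel height, kernel width), so the same operation on a PyTorch tensor would average over dimension 1 and turn (64, 3, 3, 3) into (64, 1, 3, 3).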

Evaluation

  • To evaluate FuseNet results, locate the trained model file and open it in FuseNet_Class_Plots.ipynb.

To-Do

  • Modularize the code

Citing FuseNet

Caner Hazirbas, Lingni Ma, Csaba Domokos and Daniel Cremers, "FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture", in Proceedings of the 13th Asian Conference on Computer Vision, 2016. (pdf)

@inproceedings{fusenet2016accv,
 author    = "C. Hazirbas and L. Ma and C. Domokos and D. Cremers",
 title     = "FuseNet: incorporating depth into semantic segmentation via fusion-based CNN architecture",
 booktitle = "Asian Conference on Computer Vision",
 year      = "2016",
 month     = "November",
}