Boundary-enhanced Co-training for Weakly Supervised Semantic Segmentation
Shenghai Rong, Bohai Tu, Zilei Wang, Junjie Li
- Python 3.8, PyTorch 1.11.0, and the other packages listed in requirements.txt
- PASCAL VOC 2012 devkit
- NVIDIA GPU with more than 24GB of memory
$ pip install -r requirements.txt
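As a quick check that your environment matches the requirements above, you can run something like the following (an informal sketch, not part of the repository):

import torch

# Rough environment check against the requirements listed above.
print(torch.__version__)              # expect 1.11.0
print(torch.cuda.is_available())      # expect True
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    # Training assumes a GPU with more than 24 GB of memory.
    print(props.name, round(props.total_memory / 1024**3, 1), "GB")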
- Download the PASCAL VOC 2012 dataset from the official dataset homepage.
- Download the ImageNet-pretrained model of DeepLabV3+ from mmclassification, and rename the downloaded .pth file to "resnetv1d101_mmcv.pth".
- Download the ImageNet-pretrained model of DeepLabV2 from PyTorch, and rename the downloaded .pth file to "resnet-101_v2.pth".
- Download the ImageNet-pretrained model of MiT-B2 from SegFormer.
- Please refer to ./first-stage/irn/README.md for details.
- After generating the pseudo-labels and confidence masks, rename their directories to "irn_pseudo_label" and "irn_mask", respectively.
- The generated irn_pseudo_label and irn_mask are also provided here for reproducing our method more directly. [Google Drive] / [Baidu Drive]
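To check that a generated pseudo-label and its confidence mask line up, you can inspect one pair as sketched below (run it from the directory containing the two folders; the image id is a placeholder, and the script is only an illustration, not part of the repository):

import numpy as np
from PIL import Image

# Inspect one pseudo-label / confidence-mask pair (illustration only).
img_id = "2007_000032"  # placeholder: use any image id present in both folders
label = np.array(Image.open(f"irn_pseudo_label/{img_id}.png"))
mask = np.array(Image.open(f"irn_mask/{img_id}.png"))

print(label.shape, np.unique(label))  # H x W map of class indices
print(mask.shape, np.unique(mask))    # H x W confidence mask aligned with the label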
$ cd BECO/
$ mkdir data
$ mkdir data/model_zoo
$ mkdir data/logging
Then put the dataset and pretrained models into the corresponding directories as follows:
data/
    --- VOC2012/
        --- Annotations/
        --- ImageSets/
        --- JPEGImages/
        --- SegmentationClass/
        --- ...
    --- irn_pseudo_label/
        --- ****.png
        --- ****.png
    --- irn_mask/
        --- ****.png
        --- ****.png
    --- model_zoo/
        --- resnetv1d101_mmcv.pth
    --- logging/
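Before launching training, a quick sanity check of the layout above can save a failed run (a small helper sketch; it is not part of the repository):

import os

# Check that the expected data layout described above is in place.
expected = [
    "data/VOC2012/JPEGImages",
    "data/VOC2012/SegmentationClass",
    "data/irn_pseudo_label",
    "data/irn_mask",
    "data/model_zoo/resnetv1d101_mmcv.pth",
    "data/logging",
]
for path in expected:
    print(("OK      " if os.path.exists(path) else "MISSING ") + path)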
$ CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py -dist --logging_tag beco1
This code also supports AMP acceleration, which roughly halves the GPU memory cost. Note that "batch_size" in main.py is the batch size per GPU, so you should adjust it when using a different number of GPUs to keep the total batch size at 16.
$ CUDA_VISIBLE_DEVICES=0,1 python main.py -dist --logging_tag beco1 --amp
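For example, keeping the effective batch size at 16 works out as follows (the numbers are illustrative; the actual setting lives in main.py):

# Per-GPU batch size so that the total batch size stays at 16.
total_batch_size = 16
num_gpus = 2                               # e.g. CUDA_VISIBLE_DEVICES=0,1
batch_size = total_batch_size // num_gpus  # -> 8 per GPU (use 4 with 4 GPUs)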
$ CUDA_VISIBLE_DEVICES=0 python main.py --test --logging_tag beco1 --ckpt best_ckpt.pth
Please refer to pydensecrf to install the CRF Python library, which is required for testing with CRF post-processing.
$ python test.py --crf --logits_dir ./data/logging/beco1/logits --mode "val"
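For reference, the sketch below shows what DenseCRF post-processing over per-pixel class probabilities typically looks like with pydensecrf; it is an illustrative example with assumed hyperparameters, not the exact implementation in test.py:

import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(image, probs, iters=10):
    """Refine per-pixel class probabilities with a dense CRF.
    image: H x W x 3 uint8 RGB image.
    probs: C x H x W softmax probabilities over C classes.
    Returns an H x W array of refined class indices.
    """
    c, h, w = probs.shape
    d = dcrf.DenseCRF2D(w, h, c)
    d.setUnaryEnergy(unary_from_softmax(probs))  # unary term from -log(probs)
    d.addPairwiseGaussian(sxy=3, compat=3)       # smoothness kernel
    d.addPairwiseBilateral(sxy=80, srgb=13,      # appearance kernel
                           rgbim=np.ascontiguousarray(image), compat=10)
    q = d.inference(iters)
    return np.argmax(q, axis=0).reshape(h, w)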
Method | Dataset | Backbone | Weights | Val mIoU (w/o CRF) |
---|---|---|---|---|
BECO | VOC2012 | ResNet101 | [Google Drive] / [Baidu Drive] | 70.9 |
BECO | COCO2014 | ResNet101 | [Google Drive] / [Baidu Drive] | 45.6 |
If you find our work inspiring or use our codebase in your research, please consider giving a star ⭐ and a citation.
@InProceedings{Rong_2023_CVPR,
author = {Rong, Shenghai and Tu, Bohai and Wang, Zilei and Li, Junjie},
title = {Boundary-Enhanced Co-Training for Weakly Supervised Semantic Segmentation},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2023},
pages = {19574-19584}
}