Boundary-enhanced Co-training for Weakly Supervised Semantic Segmentation
Shenghai Rong, Bohai Tu, Zilei Wang, Junjie Li
- Python 3.8, PyTorch 1.11.0, and the other packages listed in requirements.txt
- PASCAL VOC 2012 devkit
- NVIDIA GPU with more than 24GB of memory
$ pip install -r requirements.txt
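As a quick check that your environment matches the requirements above, you can run something like the following (an informal sketch, not part of the repository):

import torch

# Rough environment check against the requirements listed above.
print(torch.__version__)              # expect 1.11.0
print(torch.cuda.is_available())      # expect True
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    # Training assumes a GPU with more than 24 GB of memory.
    print(props.name, round(props.total_memory / 1024**3, 1), "GB")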
- Download the PASCAL VOC 2012 dataset from the official dataset homepage.
- Download the ImageNet-pretrained model of DeepLabV3+ from mmclassification, and rename the downloaded .pth file to "resnetv1d101_mmcv.pth".
- Download the ImageNet-pretrained model of DeepLabV2 from PyTorch, and rename the downloaded .pth file to "resnet-101_v2.pth".
- Download the ImageNet-pretrained model of MiT-B2 from SegFormer.
- Please refer to ./first-stage/irn/README.md for details.
- After generating the pseudo-labels and confidence masks, rename their directories to "irn_pseudo_label" and "irn_mask", respectively.
- The generated irn_pseudo_label and irn_mask are also provided here for reproducing our method more directly. [Google Drive] / [Baidu Drive]
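To check that a generated pseudo-label and its confidence mask line up, you can inspect one pair as sketched below (run it from the directory containing the two folders; the image id is a placeholder, and the script is only an illustration, not part of the repository):

import numpy as np
from PIL import Image

# Inspect one pseudo-label / confidence-mask pair (illustration only).
img_id = "2007_000032"  # placeholder: use any image id present in both folders
label = np.array(Image.open(f"irn_pseudo_label/{img_id}.png"))
mask = np.array(Image.open(f"irn_mask/{img_id}.png"))

print(label.shape, np.unique(label))  # H x W map of class indices
print(mask.shape, np.unique(mask))    # H x W confidence mask aligned with the label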
$ cd BECO/
$ mkdir data
$ mkdir data/model_zoo
$ mkdir data/logging
Then put the dataset and pretrained models into the corresponding directories as follows:
data/
    --- VOC2012/
        --- Annotations/
        --- ImageSets/
        --- JPEGImages/
        --- SegmentationClass/
        --- ...
    --- irn_pseudo_label/
        --- ****.png
        --- ****.png
    --- irn_mask/
        --- ****.png
        --- ****.png
    --- model_zoo/
        --- resnetv1d101_mmcv.pth
    --- logging/
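Before launching training, a quick sanity check of the layout above can save a failed run (a small helper sketch; it is not part of the repository):

import os

# Check that the expected data layout described above is in place.
expected = [
    "data/VOC2012/JPEGImages",
    "data/VOC2012/SegmentationClass",
    "data/irn_pseudo_label",
    "data/irn_mask",
    "data/model_zoo/resnetv1d101_mmcv.pth",
    "data/logging",
]
for path in expected:
    print(("OK      " if os.path.exists(path) else "MISSING ") + path)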
$ CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py -dist --logging_tag beco1
This code also supports AMP acceleration, which roughly halves the GPU memory cost. Note that "batch_size" in main.py is the batch size per GPU, so you should adjust it when using a different number of GPUs to keep the total batch size at 16.
$ CUDA_VISIBLE_DEVICES=0,1 python main.py -dist --logging_tag beco1 --amp
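For example, keeping the effective batch size at 16 works out as follows (the numbers are illustrative; the actual setting lives in main.py):

# Per-GPU batch size so that the total batch size stays at 16.
total_batch_size = 16
num_gpus = 2                               # e.g. CUDA_VISIBLE_DEVICES=0,1
batch_size = total_batch_size // num_gpus  # -> 8 per GPU (use 4 with 4 GPUs)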
$ CUDA_VISIBLE_DEVICES=0 python main.py --test --logging_tag beco1 --ckpt best_ckpt.pth
Please refer to pydensecrf to install the CRF Python library, which is required for testing with CRF post-processing.
$ python test.py --crf --logits_dir ./data/logging/beco1/logits --mode "val"
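For reference, the sketch below shows what DenseCRF post-processing over per-pixel class probabilities typically looks like with pydensecrf; it is an illustrative example with assumed hyperparameters, not the exact implementation in test.py:

import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(image, probs, iters=10):
    """Refine per-pixel class probabilities with a dense CRF.
    image: H x W x 3 uint8 RGB image.
    probs: C x H x W softmax probabilities over C classes.
    Returns an H x W array of refined class indices.
    """
    c, h, w = probs.shape
    d = dcrf.DenseCRF2D(w, h, c)
    d.setUnaryEnergy(unary_from_softmax(probs))  # unary term from -log(probs)
    d.addPairwiseGaussian(sxy=3, compat=3)       # smoothness kernel
    d.addPairwiseBilateral(sxy=80, srgb=13,      # appearance kernel
                           rgbim=np.ascontiguousarray(image), compat=10)
    q = d.inference(iters)
    return np.argmax(q, axis=0).reshape(h, w)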
Method | Dataset | Backbone | Weights | Val mIoU (w/o CRF) |
---|---|---|---|---|
BECO | VOC2012 | ResNet101 | [Google Drive] / [Baidu Drive] | 70.9 |
BECO | COCO2014 | ResNet101 | [Google Drive] / [Baidu Drive] | 45.6 |
If you find our work inspiring or use our codebase in your research, please consider giving a star ⭐ and a citation.
@InProceedings{Rong_2023_CVPR,
author = {Rong, Shenghai and Tu, Bohai and Wang, Zilei and Li, Junjie},
title = {Boundary-Enhanced Co-Training for Weakly Supervised Semantic Segmentation},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2023},
pages = {19574-19584}
}