Exploring Intra- and Inter-Video Relation for Surgical Semantic Scene Segmentation

by Yueming Jin, Yang Yu, Cheng Chen, Zixu Zhao, Pheng-Ann Heng, Danail Stoyanov.

Introduction

This repository provides the PyTorch implementation of our paper 'Exploring Intra- and Inter-Video Relation for Surgical Semantic Scene Segmentation', accepted at IEEE Transactions on Medical Imaging (TMI), 2022.

Dataset

We use the EndoVis18 and CaDIS datasets.

Results

More visual results can be found in this video.

Usage

Check dependencies:

- pytorch 1.8.0
- opencv-python
- tqdm
- timm
- pi
- numpy
- scikit-learn (imported as sklearn)
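
A quick way to check the list above before training is to test whether each package can be imported. The snippet below is a minimal sketch, not part of the repository: the `REQUIRED_MODULES` mapping and the `find_missing` helper are hypothetical names, the import names (`torch`, `cv2`, and so on) are the usual ones for these packages, and the `pi` entry is skipped because the package it refers to is not clear from the list. It only checks that the packages are present and does not verify the PyTorch version.

```python
import importlib.util

# Listed package name -> module name used at import time.
# The "pi" entry is omitted because its intended package is unclear.
REQUIRED_MODULES = {
    "pytorch": "torch",
    "opencv-python": "cv2",
    "tqdm": "tqdm",
    "timm": "timm",
    "numpy": "numpy",
    "sklearn": "sklearn",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the listed package names whose module cannot be found."""
    missing = []
    for package, module in modules.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```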

Training process

Training the Transformer-based segmentation model (Intra-video)

Switch to the folder: $ cd ./seg18/
Run $ python train_swin.py to start training; see exp.sh for the parameter settings and training script.

Training the contrastive model (Inter-video)

Switch to the folder: $ cd ./pixcontrast_18/
Run $ sh tools/pixpro_swin_ver.sh to start training.

Fine-tuning the segmentation model (Joint Intra and Inter)

Switch to the folder: $ cd ./seg18/
Run $ python train_CL_ft_mswin_sgd_minput.py to start training; see exp.sh for the parameter settings and training script.
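
The three training stages for EndoVis18 can be chained from one driver script. The following is a rough sketch under stated assumptions, not something shipped with the repository: `STAGES` and `run_stages` are hypothetical names, the folders and commands are copied from the steps in this section, and the commands are run without arguments, whereas the actual parameter settings live in exp.sh. How checkpoints are passed from one stage to the next is not covered here and should be taken from exp.sh.

```python
import subprocess

# (working directory, command) pairs for EndoVis18, in the order given above.
# Arguments are omitted on purpose; see exp.sh for the real parameter settings.
STAGES = [
    ("./seg18/", ["python", "train_swin.py"]),
    ("./pixcontrast_18/", ["sh", "tools/pixpro_swin_ver.sh"]),
    ("./seg18/", ["python", "train_CL_ft_mswin_sgd_minput.py"]),
]


def run_stages(stages=STAGES, dry_run=True):
    """Return the planned commands; execute them in order when dry_run is False."""
    planned = []
    for cwd, cmd in stages:
        planned.append(f"(cd {cwd} && {' '.join(cmd)})")
        if not dry_run:
            # check=True stops the pipeline as soon as one stage fails.
            subprocess.run(cmd, cwd=cwd, check=True)
    return planned


if __name__ == "__main__":
    # Dry run by default: print the plan without launching any training.
    for line in run_stages(dry_run=True):
        print(line)
```

Calling `run_stages(dry_run=False)` from the repository root would launch the stages one after another and raise `subprocess.CalledProcessError` if any of them exits with a non-zero status.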

Test & Visualization

Run $ python test.py to test the model; see exp.sh for the parameter settings and script.

Note:

The folders seg18 and pixcontrast_18 are for EndoVis18; segcata and pixcontrast_cata are for CaDIS. The steps above use EndoVis18 as the example; the usage for CaDIS is similar.

Citation

@ARTICLE{9779714,
  author={Jin, Yueming and Yu, Yang and Chen, Cheng and Zhao, Zixu and Heng, Pheng-Ann and Stoyanov, Danail},
  journal={IEEE Transactions on Medical Imaging}, 
  title={Exploring Intra- and Inter-Video Relation for Surgical Semantic Scene Segmentation}, 
  year={2022},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/TMI.2022.3177077}
}

Questions

For further questions about the code or paper, please contact 'ymjin5341@gmail.com'.

YuemingJin/STswinCL