TextPolyp: Point-Supervised Polyp Segmentation with Text Cues (MICCAI 2024)

Authors:

*Yiming Zhao*, Yi Zhou, Yizhe Zhang, Ye Wu, and Tao Zhou.

1. Preface

This repository provides the code for "TextPolyp: Point-Supervised Polyp Segmentation with Text Cues" (MICCAI 2024).

2. Overview

2.1. Introduction

Polyp segmentation in colonoscopy images is essential for preventing colorectal cancer (CRC). Existing polyp segmentation models often depend on costly pixel-wise annotations. In contrast, datasets can be annotated quickly and affordably using weak labels such as points. However, training a model on such sparse annotations remains challenging because they carry limited information. We propose TextPolyp to tackle this issue by leveraging only point annotations and text cues for effective weakly-supervised polyp segmentation. Specifically, we use the Grounding DINO algorithm and the Segment Anything Model (SAM) to generate initial pseudo-labels, which are then refined with point annotations. Furthermore, we employ a SAM-based mutual learning strategy to enhance the segmentation results from SAM. Our TextPolyp model is versatile and can be integrated with various backbones and segmentation methods.

2.2. Framework Overview

Figure 1: Overview of the proposed framework. SegNet and SAM produce masks (S_bas and S_ori) from a given image, while the gamma-corrected image is also fed to SAM to generate S_gam.
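
As an illustration of the gamma-corrected input mentioned in the caption, the following is a minimal sketch of standard gamma correction on 8-bit intensities, out = 255 * (in / 255) ** gamma. The function names and the gamma value are assumptions made for this example; the repository may use a different value or library to produce the images in the gamma folder.

```python
from typing import List


def build_gamma_lut(gamma: float) -> List[int]:
    """Map each 8-bit intensity v to round(255 * (v / 255) ** gamma)."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return [round(255 * (v / 255) ** gamma) for v in range(256)]


def gamma_correct(image: List[List[int]], gamma: float) -> List[List[int]]:
    """Apply gamma correction to a grayscale image given as rows of 0-255 ints."""
    lut = build_gamma_lut(gamma)
    return [[lut[pixel] for pixel in row] for row in image]


# Tiny hypothetical image; gamma=0.5 is an assumed value, not taken from the paper.
image = [[0, 64, 128], [192, 255, 32]]
brighter = gamma_correct(image, gamma=0.5)
```

With gamma below 1 the mid-tones become brighter, with gamma above 1 they become darker, and the endpoints 0 and 255 are unchanged.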

2.3. Qualitative Results

Figure 2: Visualization results of different methods on the polyp segmentation.

3. Proposed Method

3.1. Training/Testing

The training and testing experiments are conducted using PyTorch on one NVIDIA GeForce RTX 3090 GPU with 24 GB of memory.

Configuring your environment (Prerequisites):

Note that TextPolyp has only been tested on Ubuntu with the following environment. It may work on other operating systems as well, but we do not guarantee that it will.
- Create a virtual environment in the terminal: conda create -n TextPolyp python=3.6
- Install the necessary packages: pip install -r requirements.txt
Downloading necessary data:
- Download the polyp dataset from the Google Drive link or Baidu Drive (extraction code: esgj).
- Download Grounding DINO from GitHub for pseudo-label generation. The code that generates pseudo-labels by combining points with SAM is located in the ./pseudo folder.
- Download the SAM weights from GitHub and save them as ./checkpoints/sam_vit_b_01ec64.pth.
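
To give a feel for how a point annotation can refine a coarse mask, here is one simple, hypothetical refinement rule: keep only the connected region of the mask that contains the annotated point. This is a sketch for illustration only, not the procedure implemented in ./pseudo; the function name, the (row, col) point format, and the 4-connectivity are all assumptions.

```python
from collections import deque
from typing import List, Tuple


def keep_region_at_point(mask: List[List[int]], point: Tuple[int, int]) -> List[List[int]]:
    """Return a mask holding only the 4-connected foreground region containing `point`.

    `mask` is a binary mask as rows of 0/1 values; `point` is (row, col).
    If the point falls on background, an empty mask is returned.
    """
    rows, cols = len(mask), len(mask[0])
    r0, c0 = point
    refined = [[0] * cols for _ in range(rows)]
    if not mask[r0][c0]:
        return refined

    # Breadth-first flood fill starting from the annotated point.
    queue = deque([(r0, c0)])
    refined[r0][c0] = 1
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc] and not refined[nr][nc]:
                refined[nr][nc] = 1
                queue.append((nr, nc))
    return refined


# Hypothetical coarse mask with two separate regions; the point lies in the left one.
coarse = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
refined = keep_region_at_point(coarse, point=(0, 1))
```

In this toy case the region on the right is discarded because it does not contain the annotated point.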

Preprocessing:

Download the training and testing data and put them into the ./data folder with the following structure:

|-- data
|   |-- TrainDB
|       |-- image
|       |-- point
|       |-- gamma
|       |-- pseudo
|   |-- TestDB
|       |-- CVC-300
|       |-- CVC-ClinicDB
|       |-- CVC-ColonDB
|       |-- ETIS-LaribPolypDB
|       |-- Kvasir
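
Before training, it can help to confirm that the folders match the tree above. The helper below is a hypothetical convenience script, not part of the repository; it only checks that the listed directories exist and does not inspect their contents.

```python
from pathlib import Path
from typing import List

# Directory names copied from the layout shown above.
TRAIN_SUBDIRS = ["image", "point", "gamma", "pseudo"]
TEST_SETS = ["CVC-300", "CVC-ClinicDB", "CVC-ColonDB", "ETIS-LaribPolypDB", "Kvasir"]


def missing_data_dirs(root: str = "./data") -> List[Path]:
    """Return the expected directories that do not exist under `root`."""
    base = Path(root)
    expected = [base / "TrainDB" / name for name in TRAIN_SUBDIRS]
    expected += [base / "TestDB" / name for name in TEST_SETS]
    return [path for path in expected if not path.is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    for path in missing:
        print("missing:", path)
    if not missing:
        print("data layout looks complete")
```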

Download the pre-trained model.

Training:
- After preprocessing, run python train.py to train our model.
Testing:
- After you download the testing datasets and put them into ./data, run python test.py to generate the final prediction maps.

4. Citation

Please cite our paper if you find the work useful, thanks!

@inproceedings{zhao2024textpolyp,
title={TextPolyp: Point-Supervised Polyp Segmentation with Text Cues},
author={Zhao, Yiming and Zhou, Yi and Zhang, Yizhe and Wu, Ye and Zhou, Tao},
booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
pages={711--722},
year={2024},
organization={Springer}
}

5. License

The source code and dataset are free for research and education use only. Any commercial use requires formal permission first.


taozh2017/TextPolyp