Documentation for AI-Endo

This is the PyTorch implementation of the paper "Intelligent Surgical Workflow Recognition for Endoscopic Submucosal Dissection with Real-time Animal Study" by Jianfeng Cao, Hon-Chi Yip, Yueyao Chen, Markus Scheppach, Xiaobei Luo, Hongzheng Yang, Ming Kit Cheng, Yonghao Long, Yueming Jin, Philip Wai-Yan Chiu, Yeung Yam, Helen Mei-Ling Meng, and Qi Dou.

Dependency installation

The model is built on PyTorch. To install the dependencies, run

git clone https://github.com/med-air/AI-Endo.git
cd AI-Endo
conda env create -f environment.yml
conda activate AI-Endo

Data preparation

AI-Endo is trained on downsampled frames from endoscopic videos. Example data are available from [figshare](https://doi.org/10.6084/m9.figshare.23506866.v5) and should be downloaded and arranged locally as

DATA_ROOT--|
           |--Images--|
           |          |--Video1--|
           |          |          |--Image00001.png
           |          |          |--Image00002.png
           |          |...
           |
           |--Labels--|--Phase1.txt
                      |--Phase2.txt
                      |...

DATA_ROOT is the root folder of the dataset and should be set accordingly in the config file, e.g., configs/test.yml.
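Before training or testing, it can help to confirm that the downloaded data matches the layout above. The following is a minimal sketch, not part of the repository: `check_dataset_layout` is a hypothetical helper, and it assumes only what the tree shows, namely an Images folder with one subfolder of .png frames per video and a Labels folder containing .txt files.

```python
from pathlib import Path


def check_dataset_layout(data_root):
    """Return a list of problems found under data_root (empty if none).

    Hypothetical helper: it only checks the folder structure shown in
    the tree above and does not inspect image or label contents.
    """
    root = Path(data_root)
    problems = []

    images_dir = root / "Images"
    labels_dir = root / "Labels"

    if not images_dir.is_dir():
        problems.append(f"missing folder: {images_dir}")
    else:
        # Each video is expected to have its own subfolder of frames.
        video_dirs = sorted(p for p in images_dir.iterdir() if p.is_dir())
        if not video_dirs:
            problems.append(f"no video folders in {images_dir}")
        for video_dir in video_dirs:
            if not any(video_dir.glob("*.png")):
                problems.append(f"no .png frames in {video_dir}")

    if not labels_dir.is_dir():
        problems.append(f"missing folder: {labels_dir}")
    elif not any(labels_dir.glob("*.txt")):
        problems.append(f"no .txt label files in {labels_dir}")

    return problems
```

Calling `check_dataset_layout("/path/to/DATA_ROOT")` returns an empty list when the structure looks as expected, and otherwise one message per missing piece.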

Train

Training AI-Endo proceeds in two stages: ResNet50, followed by Fusion+Transformer. Before training, specify the dataset in the config file ./configs/train.yml, including the paths to the videos downsampled to 1 fps and their corresponding annotations.

python get_paths_labels.py
python train_all.py --cfg train
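The 1 fps downsampling mentioned above amounts to keeping roughly one source frame per second of video. The sketch below illustrates that selection in plain Python. It is an assumption about how such sampling could be done, not the repository's own preprocessing: `frames_at_1fps` and `frame_name` are hypothetical helpers, and the file naming simply mirrors the Image00001.png pattern from the data layout.

```python
def frames_at_1fps(total_frames, native_fps):
    """Return 0-based indices of the source frames kept at 1 frame per second."""
    if native_fps <= 0:
        raise ValueError("native_fps must be positive")
    indices = []
    second = 0
    while True:
        # Nearest source frame to the start of each whole second.
        idx = int(round(second * native_fps))
        if idx >= total_frames:
            break
        indices.append(idx)
        second += 1
    return indices


def frame_name(position):
    """Hypothetical 1-based, zero-padded name such as Image00001.png."""
    return f"Image{position:05d}.png"


# Example: a 4-second clip recorded at 25 fps.
kept = frames_at_1fps(total_frames=100, native_fps=25)
names = [frame_name(i + 1) for i in range(len(kept))]
```

Rounding to the nearest frame keeps the sampling aligned with wall-clock seconds even for non-integer frame rates such as 29.97 fps.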

Prediction

Set the file paths of the trained models in ./configs/test.yml and run

# Option 1: offline prediction
python test_all.py --cfg test_offline

# Option 2: online prediction
python online.py -s --cfg test

Pretrained models are available on Google Drive.

Acknowledgment

The code in this repository is partially based on Trans-SVNet and TMRNet.

Citation

TBD

Correspondence

For further questions about the code, please contact jianfeng13.cao@gmail.com.

LICENSE

This project is covered under the MIT License.