Temporal Memory Relation Network for Workflow Recognition from Surgical Video

by Yueming Jin, Yonghao Long, Cheng Chen, Zixu Zhao, Qi Dou, Pheng-Ann Heng.

Introduction

This is the PyTorch implementation of our paper 'Temporal Memory Relation Network for Workflow Recognition from Surgical Video', accepted at IEEE Transactions on Medical Imaging (TMI).

Data Preparation

We use the Cholec80 and M2CAI 2016 Challenge datasets.
Training and test data split

Cholec80: the first 40 videos for training and the remaining 40 videos for testing, following the original EndoNet paper.

M2CAI: 27 videos for training and 14 videos for testing, following the challenge evaluation protocol.
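
The Cholec80 split above can be written down as explicit video-ID lists. The following is a minimal sketch, assuming the videos are numbered from 1 to 80; the helper name `first_n_split` is hypothetical and not part of this repository. M2CAI is left out because how its 41 videos are numbered is not stated here.

```python
def first_n_split(num_videos, num_train):
    """Return (train_ids, test_ids) using 1-based video IDs.

    The first `num_train` videos go to training, the rest to testing.
    """
    if not 0 <= num_train <= num_videos:
        raise ValueError("num_train must be between 0 and num_videos")
    ids = list(range(1, num_videos + 1))
    return ids[:num_train], ids[num_train:]


# Cholec80: first 40 videos for training, remaining 40 for testing.
train_ids, test_ids = first_n_split(80, 40)
print(len(train_ids), len(test_ids))  # 40 40
print(train_ids[0], train_ids[-1], test_ids[0], test_ids[-1])  # 1 40 41 80
```
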
Data Preprocessing:

1. Use FFmpeg to convert the videos to frames;
2. Downsample from 25 fps to 1 fps (or set the extraction rate to 1 fps directly in the previous step);
3. Crop the black margins from each frame using the function change_size() in video2frame_cutmargin.py;

Note: You can also use video2frame_cutmargin.py directly for steps 1 and 3; this yields cropped frames at the original fps.

4. Resize each frame to a resolution of 250 × 250.
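
The first two preprocessing steps (frame extraction and downsampling to 1 fps) can be combined into a single FFmpeg call using its `fps` video filter. The sketch below only builds the argument list. The helper name `build_ffmpeg_command`, the input file name, and the `%d.jpg` output naming are assumptions for illustration and may not match what the repository scripts expect. Cropping and resizing are not covered here.

```python
from pathlib import Path


def build_ffmpeg_command(video_path, out_dir, fps=1):
    """Build an FFmpeg argument list that writes frames sampled at `fps` per second."""
    # Hypothetical output naming: 1.jpg, 2.jpg, ... inside out_dir.
    out_pattern = str(Path(out_dir) / "%d.jpg")
    return [
        "ffmpeg",
        "-i", str(video_path),   # input video
        "-vf", f"fps={fps}",     # resample to the requested frame rate
        out_pattern,
    ]


cmd = build_ffmpeg_command("video01.mp4", "frames/1")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once FFmpeg is installed and the output directory exists.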

The structure of data folder is arranged as follows:

(root folder)
├── data
|  ├── cholec80
|  |  ├── cutMargin
|  |  |  ├── 1
|  |  |  ├── 2
|  |  |  ├── 3
|  |  |  ├── ......
|  |  |  ├── 80
|  |  ├── phase_annotations
|  |  |  ├── video01-phase.txt
|  |  |  ├── ......
|  |  |  ├── video80-phase.txt
├── code
|  ├── ......
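
Before training, it can help to check that the data folder matches the layout above. This is a minimal sketch, assuming frame folders are named 1 to 80 and annotation files are named video01-phase.txt to video80-phase.txt, as the tree suggests. The function names are hypothetical and not part of the repository.

```python
from pathlib import Path


def expected_cholec80_paths(root, num_videos=80):
    """Return (frame_dirs, annotation_files) following the layout shown above."""
    base = Path(root) / "data" / "cholec80"
    frame_dirs = [base / "cutMargin" / str(i) for i in range(1, num_videos + 1)]
    annotation_files = [
        base / "phase_annotations" / f"video{i:02d}-phase.txt"
        for i in range(1, num_videos + 1)
    ]
    return frame_dirs, annotation_files


def missing_paths(root, num_videos=80):
    """List every expected folder or file that does not exist under `root`."""
    frame_dirs, annotation_files = expected_cholec80_paths(root, num_videos)
    return [p for p in frame_dirs + annotation_files if not p.exists()]


frame_dirs, annotation_files = expected_cholec80_paths(".")
print(frame_dirs[0], annotation_files[0])
```

Calling `missing_paths(root)` returns an empty list when all 80 frame folders and all 80 annotation files are present.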

Setup & Training

Check dependencies:

- pytorch 1.0+
- opencv-python
- numpy
- scikit-learn (imported as sklearn)
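
A quick way to check that the listed dependencies are importable is to look them up by their import names (`torch` for PyTorch, `cv2` for opencv-python). This is a minimal sketch using only the standard library; the helper name `find_missing` is hypothetical, and it does not verify the PyTorch version.

```python
import importlib.util

# Import names for the dependencies listed above.
REQUIRED_MODULES = ["torch", "cv2", "numpy", "sklearn"]


def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
print("Missing modules:", missing if missing else "none")
```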

Clone this repo

git clone https://github.com/YuemingJin/TMRNet

Training model for building memory bank

Switch folder $ cd "./code/Training memory bank model/"
Run $ python get_paths_labels.py to generate the files needed for training
Run $ python train_singlenet_phase_1fc.py to start training

Training TMRNet

Switch folder $ cd "./code/Training TMRNet/"
Place the trained model obtained from the memory bank training stage above in the folder ./LFB/FBmodel/
Run $ python get_paths_labels.py to generate the files needed for training
Set the argument 'model_path' in train_*.py to ./LFB/FBmodel/{your_model_name}.pth

Run the chosen train_*.py script to start training

Note: The first time you run a train_*.py file, set the argument 'load_LFB' to False so that the memory bank is generated.
There are three configurations of train_*.py:
1. train_only_non-local_pretrained.py: captures only long-range temporal patterns (ResNet);
2. train_non-local_mutiConv_resnet.py: captures long-range multi-scale temporal patterns (ResNet);
3. train_non-local_mutiConv_resnest.py: captures long-range multi-scale temporal patterns (ResNeSt), achieving the best results.
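
The note about 'load_LFB' describes a generate-once, load-afterwards pattern. The following is only a generic sketch of that pattern and is not taken from the repository. The names `get_feature_bank` and `compute_features` and the use of a pickle file are hypothetical, and the actual scripts may build and store the memory bank differently.

```python
import pickle
from pathlib import Path


def get_feature_bank(cache_path, compute_features, load_lfb):
    """Load a cached feature bank if `load_lfb` is True, otherwise compute and cache it."""
    cache_path = Path(cache_path)
    if load_lfb:
        # Later runs: reuse the bank written by the first run.
        with cache_path.open("rb") as f:
            return pickle.load(f)
    # First run: compute the bank (the expensive step) and save it.
    bank = compute_features()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as f:
        pickle.dump(bank, f)
    return bank


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "bank.pkl"
        first = get_feature_bank(path, lambda: [[0.1, 0.2], [0.3, 0.4]], load_lfb=False)
        second = get_feature_bank(path, lambda: None, load_lfb=True)
        print(first == second)  # True
```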

Testing

Our trained models can be downloaded from Dropbox.

Switch folder $ cd ./code/eval/python/
Run $ python get_paths_labels.py to generate the files needed for testing
Specify the feature bank path, model path, and test file path in ./test_*.py
Run ./test_*.py to generate the results.
Run ./export_phase_copy.py to export the results as txt files.

We evaluate our method using the evaluation protocol of the M2CAI challenge.

Switch folder $ cd ./code/eval/result/matlab-eval/
Run the MATLAB files ./Main_*.m to evaluate and print the results.

Citation

If this repository is useful for your research, please cite:

@ARTICLE{9389566,  
  author={Jin, Yueming and Long, Yonghao and Chen, Cheng and Zhao, Zixu and Dou, Qi and Heng, Pheng-Ann},  
  journal={IEEE Transactions on Medical Imaging},   
  title={Temporal Memory Relation Network for Workflow Recognition From Surgical Video},
  year={2021},  
  volume={40},  
  number={7},  
  pages={1911-1923},  
  doi={10.1109/TMI.2021.3069471}
}

Questions

For further questions about the code or paper, please contact 'ymjin5341@gmail.com'.