LRTD: Long-Range Temporal Dependency based Active Learning for Surgical Workflow Recognition

arxiv.org/pdf/2004.09845.pdf (IPCAI2020) Xueying Shi, Yueming Jin, Qi Dou, and Pheng-Ann Heng

Introduction

The LRTD repository contains the code for our LRTD paper. We validate our approach on Cholec80, a large surgical video dataset, on the surgical workflow recognition task. With our LRTD-based selection strategy, we outperform other state-of-the-art active learning methods that only consider neighbor-frame information. Using at most 50% of the samples, our approach can exceed the performance of full-data training.

Fig. 3: LRTD based sample selection. LRTD comes from the non-local cross-frame dependency score that is computed by dependency matrix Mmn for clip XT in Eq. 4.

Requirements

python 3.6.9
torch 0.4.1

Usage

Download the Cholec80 data, then split the videos into image frames at 25 fps using ffmpeg.
```
sh split_video_to_image.sh
```
Resize the images from 1920 x 1080 to 250 x 250.

After creating the data, the data folders should have the following structure:

 ├── chorec80 
 |   ├── data_frames(put the raw image frames in this folder)
 |       ├── video01
 |          ├── video01-1.jpg
 |          ├── ...
 |          ├── video01-xxx.jpg
 |       ├── videoxxx
 |       ├── ...
 |       ├── video80
 |   ├── data_resize(put the resized image frames in this folder)	   
 |       ├── video01
 |          ├── video01-1.jpg
 |          ├── ...
 |          ├── video01-xxx.jpg
 |       ├── videoxxx
 |       ├── ...
 |       ├── video80
 |   ├── phase_annotations
 |       ├── video01-phase.txt
 |       ├── ...
 |       ├── video80-phase.txt
 |   ├── tool_annotations
 |       ├── video01-tool.txt
 |       ├── ...
 |       ├── video80-tool.txt
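
Before running the later steps, it can help to confirm that the folders match the layout above. The following is a minimal sketch, not part of the repository: the function name `missing_entries` and the `num_videos` parameter are hypothetical, and the check only looks at the folder and annotation file names shown in the tree, not at the frames inside each video folder.

```python
from pathlib import Path

# Top-level folders taken from the directory tree above.
EXPECTED_SUBDIRS = ("data_frames", "data_resize",
                    "phase_annotations", "tool_annotations")


def missing_entries(root, num_videos=80):
    """Return expected paths (relative to root) that do not exist.

    Hypothetical helper: it assumes videos are named video01..videoNN
    and annotation files follow the videoNN-phase.txt / videoNN-tool.txt
    pattern shown in the tree.
    """
    root = Path(root)
    missing = [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]
    for i in range(1, num_videos + 1):
        name = "video{:02d}".format(i)
        candidates = [
            Path("data_frames") / name,
            Path("data_resize") / name,
            Path("phase_annotations") / (name + "-phase.txt"),
            Path("tool_annotations") / (name + "-tool.txt"),
        ]
        for rel in candidates:
            if not (root / rel).exists():
                missing.append(rel.as_posix())
    return missing
```

Calling `missing_entries("path/to/data_root")` returns an empty list when every expected folder and annotation file is present.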

Split the data into train, val, and test sets (in our setting, we downsample from 25 fps to 1 fps when splitting the data).
```
python get_paths_labels.py .
```
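
To illustrate the 25 fps to 1 fps downsampling mentioned above, here is a minimal sketch. It is an assumption about how such subsampling can be done, not a copy of what `get_paths_labels.py` does: the helper names `downsample_indices` and `frame_name` are hypothetical, and keeping the first frame of every second is only one possible choice. The 1-based file naming follows the `video01-1.jpg` pattern in the folder structure.

```python
def downsample_indices(num_frames, src_fps=25, dst_fps=1):
    """Return 0-based indices of the frames kept when reducing the frame rate.

    Assumption: keep the first frame of each group of src_fps // dst_fps
    frames. The repository's script may pick a different frame.
    """
    if dst_fps <= 0 or src_fps % dst_fps != 0:
        raise ValueError("src_fps must be a positive multiple of dst_fps")
    step = src_fps // dst_fps
    return list(range(0, num_frames, step))


def frame_name(video_id, frame_index):
    """Build a frame file name such as video01-1.jpg (file names are 1-based)."""
    return "video{:02d}-{}.jpg".format(video_id, frame_index + 1)


# Example: a 100-frame clip at 25 fps keeps 4 frames at 1 fps.
kept = [frame_name(1, i) for i in downsample_indices(100)]
```

The same indices can be used to pick the matching rows from the phase and tool annotation files, so that images and labels stay aligned.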
Select partial data for training

We initialize with a randomly selected 10% of the data from the unlabelled sample pool. The selected data is stored in the nonlocalselect_txt folder. Note that select_chose can be 'DBN' (the comparison method) or 'non_local'. The first 10% of the data is always chosen by random selection; from the next selection onward, 'DBN' or 'non_local' is used.

For the first 10% selection, set '--is_first_selection=True' in ./nonlocalselect.sh. The other parameters listed in ./nonlocalselect.sh can be ignored, because we set a breakpoint after data selection: when the program reaches the breakpoint, data selection is finished and you can quit the program directly.

For the 20%-50% selections, comment out '--is_first_selection=True'. In addition, change '--val_model_path' to indicate which model serves as the validation model for the rest of the data. For example, once a model has been trained on 10% of the data, use that model as the validation model to select the next 10%.
```
./nonlocalselect.sh
```
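
The round structure described above (a random first 10%, then 10% more per round chosen by a selection method) can be sketched as follows. This is an illustrative outline under stated assumptions, not the code in the repository: `select_initial` and `select_next` are hypothetical names, sample ids stand for whatever unit the selection script indexes, and the ranking passed to `select_next` is supplied by the caller, since how 'DBN' or 'non_local' score samples is defined by the paper and the repository, not here.

```python
import random


def select_initial(pool_ids, fraction=0.1, seed=0):
    """First round: randomly pick `fraction` of the unlabelled pool."""
    pool = sorted(pool_ids)
    k = int(round(len(pool) * fraction))
    return sorted(random.Random(seed).sample(pool, k))


def select_next(ranked_candidates, labelled_ids, pool_size, fraction=0.1):
    """Later rounds: add the next `fraction` of the pool.

    `ranked_candidates` lists sample ids, most preferred first. The order
    is an input here; it would come from the chosen selection method.
    """
    labelled = set(labelled_ids)
    k = int(round(pool_size * fraction))
    added = 0
    for cid in ranked_candidates:
        if added >= k:
            break
        if cid not in labelled:
            labelled.add(cid)
            added += 1
    return sorted(labelled)


# Example: grow the labelled set from 10% to 20% of a 100-sample pool.
pool = list(range(100))
round1 = select_initial(pool)
round2 = select_next(list(reversed(pool)), round1, len(pool))
```

Fixing the seed makes the first-round selection reproducible, which matters when comparing selection methods that start from the same initial 10%.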

To train the ResLSTM backbone, set '--json_name' to indicate which batch of data you want to use; the json files are stored in the nonlocalselect_txt folder.
```
./train_nolocalselect_ResNetLSTM.sh
```
To train the ResLSTM-Nonlocal backbone, set '--FT_checkpoint' to the previous ResLSTM model stored in the results_ResLSTM_Nolocal/roundx/RESLSTM folder.
```
./train_nolocalselect_ResNetLSTM_nolocalFT.sh 
```

For testing, run the following command, where model is a model stored in the results_ResLSTM_nolocal/roundx/RESLSTM_NOLOCAL folder:

python test_singlenet_phase_+nonlocal.py -c 0 -n model

Citation

If the code is helpful for your research, please cite our paper.

@inproceedings{shi2020lrtd,
title={LRTD: Long-Range Temporal Dependency based Active Learning for Surgical Workflow Recognition},
author={Xueying Shi and Yueming Jin and Qi Dou and Pheng-Ann Heng},
year={2020},
booktitle={International Conference on Information Processing in Computer-Assisted Interventions (IPCAI)},
publisher={Springer}
}

