STFT: Spatial-Temporal Feature Transformation

By Lingyun Wu, Zhiqiang Hu, Yuanfeng Ji, Ping Luo, Shaoting Zhang.

This repo is a PyTorch implementation of "Multi-frame Collaboration for Effective Endoscopic Video Polyp Detection via Spatial-Temporal Feature Transformation", accepted by MICCAI 2021.

This repository contains the implementation of our approach, STFT, and of several other video object detection algorithms (FGFA, RDN, and MEGA), all built on mega.pytorch. It also provides training and testing scripts to reproduce the results on Endoscopic Video Datasets reported in our paper.

News

[2021/11/12] For the implementation on the ImageNet VID dataset, please refer to here.
[2021/09/21] Released implementations of other video-based methods on Endoscopic Video Datasets.
[2021/09/21] Released training/testing scripts and the pretrained model for STFT.
[2021/06/29] Created repository.

Model Zoo

Supported backbones:

ResNet

Supported image-based methods:

FCOS (ICCV2019)
RetinaNet (ICCV2017)
BorderDet (ECCV2020)

Supported video-based methods:

FGFA (ICCV2017)
RDN (ICCV2019)
MEGA (CVPR2020)
STFT (MICCAI2021)

Installation

Please follow INSTALL.md for installation instructions.

Usage

Please follow GetStarted.md for usage instructions.

Citing STFT

New methods are welcome. We hope this repository helps further research in video object detection and beyond. If it helps your research, please cite our paper in your publications:

@inproceedings{wu2021multi,
  title={Multi-frame collaboration for effective endoscopic video polyp detection via spatial-temporal feature transformation},
  author={Wu, Lingyun and Hu, Zhiqiang and Ji, Yuanfeng and Luo, Ping and Zhang, Shaoting},
  booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
  pages={302--312},
  year={2021},
  organization={Springer}
}

Contributing to the project

Pull requests and issues are welcome.