[ECCV RVC 2022 Winner] Inference code for MS-RAFT+, the 2022 Robust Vision Challenge winner in the category "Optical Flow"

MS_RAFT_plus

In this repository we release (for now) the inference code for our work:

High Resolution Multi-Scale RAFT (Robust Vision Challenge 2022)
Robust Vision Challenge 2022
Azin Jahedi, Maximilian Luz, Lukas Mehl, Marc Rivinius and Andrés Bruhn

If you find our work useful, please cite it via BibTeX.

This work builds upon MS_RAFT.

Requirements

The code has been tested with PyTorch 1.10.2+cu113. Install the required dependencies via

pip install -r requirements.txt

Alternatively, you can manually install the following packages in your virtual environment:

  • torch, torchvision, and torchaudio (e.g., with --extra-index-url https://download.pytorch.org/whl/cu113 for CUDA 11.3)
  • matplotlib
  • scipy
  • tensorboard
  • opencv-python
  • tqdm
  • parse
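
As a quick sanity check after installation, a minimal sketch like the following reports which of the packages above cannot be found in the active environment. The helper name `missing_modules` is hypothetical and not part of this repository; the only non-obvious detail is that `opencv-python` is imported under the module name `cv2`.

```python
import importlib.util

# Import names for the packages listed above (assumption: default import names;
# opencv-python is imported as "cv2").
REQUIRED_MODULES = [
    "torch", "torchvision", "torchaudio", "matplotlib",
    "scipy", "tensorboard", "cv2", "tqdm", "parse",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("missing modules:", ", ".join(missing))
    else:
        print("all required modules found")
```

This only checks that the modules can be located; it does not verify versions or CUDA support.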

Pre-Trained Checkpoints

You can download our pre-trained model from the releases page.

Datasets

Datasets are expected to be located under ./data in the following layout:

./data
  ├── kitti15                   # KITTI 2015
  │  └── dataset
  │     ├── testing/...
  │     └── training/...
  ├── middlebury                # Middlebury
  │  ├── test/...
  │  │  └── img/...
  │  └── training/...
  │     ├── flow/...
  │     └── img/...
  ├── sintel                    # Sintel
  │  ├── test/...
  │  └── training/...
  └── viper                     # Viper
     ├── test/img/...
     └── val
        ├── flow/...
        └── img/...
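
Before running an evaluation, it can help to confirm that the directories match the layout above. The following is a minimal sketch under the assumption that exactly the directories shown in the tree are required; `EXPECTED_DIRS` and `missing_dataset_dirs` are hypothetical names and not part of this repository.

```python
from pathlib import Path

# Relative directories copied from the layout above; adjust if your copy differs.
EXPECTED_DIRS = [
    "kitti15/dataset/testing",
    "kitti15/dataset/training",
    "middlebury/test/img",
    "middlebury/training/flow",
    "middlebury/training/img",
    "sintel/test",
    "sintel/training",
    "viper/test/img",
    "viper/val/flow",
    "viper/val/img",
]


def missing_dataset_dirs(root="./data"):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    for rel in missing_dataset_dirs():
        print(f"missing: ./data/{rel}")
```

If you only evaluate on one dataset, the entries for the other datasets can be removed from the list.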

Running MS_RAFT_plus

Running MS_RAFT_plus on MPI Sintel images requires about 4 GB of GPU VRAM.

To compile the CUDA correlation module, run the following once:

cd alt_cuda_corr && python setup.py install && cd ..

You can then evaluate the pre-trained model via:

python evaluate.py --model mixed.pth --dataset sintel --cuda_corr

Note that with --cuda_corr, the code computes matching costs on demand instead of pre-computing the full cost volume, because pre-computing it is very memory-intensive at high resolutions.
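
To make the memory argument concrete, the following back-of-the-envelope sketch compares the size of a full all-pairs cost volume with the size of a local lookup window per position. It is a simplified estimate, not a measurement of this code: the function names, the 1024x436 frame size, the downsampling factors, the lookup radius, and the 4-byte element size are all illustrative assumptions.

```python
import math


def all_pairs_cost_volume_bytes(height, width, downsample, bytes_per_element=4):
    """Estimate memory for a full all-pairs cost volume at one feature resolution."""
    h = math.ceil(height / downsample)
    w = math.ceil(width / downsample)
    positions = h * w
    # Every position in frame 1 is matched against every position in frame 2.
    return positions * positions * bytes_per_element


def on_demand_lookup_bytes(height, width, downsample, radius=4, bytes_per_element=4):
    """Estimate memory when only a (2r+1) x (2r+1) window is computed per position."""
    h = math.ceil(height / downsample)
    w = math.ceil(width / downsample)
    window = (2 * radius + 1) ** 2
    return h * w * window * bytes_per_element


if __name__ == "__main__":
    # Assumed frame size (width 1024, height 436) and downsampling factors.
    for d in (8, 4, 2):
        full = all_pairs_cost_volume_bytes(436, 1024, d)
        local = on_demand_lookup_bytes(436, 1024, d)
        print(f"1/{d} resolution: full {full / 2**20:.1f} MiB, "
              f"on-demand {local / 2**20:.1f} MiB")
```

The full volume grows quadratically with the number of feature positions, whereas the on-demand window grows only linearly, which is why the gap widens quickly at finer resolutions.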

License

  • Our code is licensed under the BSD 3-Clause No Military License. See LICENSE.
  • The provided checkpoint is under the CC BY-NC-SA 3.0 license.

Acknowledgement

Parts of this repository are adapted from RAFT (license). We thank the authors for their excellent work.