This project is part of a submodule of the VARAID project. VARAID-VSR is a project which is forked from the project Zooming Slow-Mo. We add further adjustments in this forked project, such as segmentation integration during training to better cater for Football broadcasts. We also trained this network on our FootballVids and PlayerVids dataset.
git clone https://github.com/nadimra/vsr
- You must install have a specific version of Pytorch (1.9) installed to correspond to be compatible with the DCNv2 module which is integrated in this project. A GPU is also required. The choice of CUDA is dependant on your system, but for this project, we used CUDA 11.1 since it is available on Imperials machines. To follow the same installation as us, install
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
. Note: Uninstall torch, torchaudio and torchvision if those packages are already in the environment and is not version 1.9. - Navigate to the root of the directory and install the rest of the packages
pip install -r requirements.txt
- Before compiling DCNv2, you must ensure that you set the configuration of the CUDA directories. If you are following the exact installation process in the same system, then the following will suffice:
export CUDA_HOME=/vol/cuda/11.1.0-cudnn8.0.4.30
export CUDNN_INCLUDE_DIR=/vol/cuda/11.1.0-cudnn8.0.4.30/include
export CUDNN_LIB_DIR=/vol/cuda/11.1.0-cudnn8.0.4.30/lib64
- Navigate to
/codes/models/modules/DCNv2
and runbash make.sh
- Create a folder
ckpts
within the root directory and place your model paths here. The trained models for this project can be found in our Model Zoo. Note: If using our VarAid application, you can download ModelQ.pth and ModelR.pth for a x2 and x4 model (keep the same file names). If you prefer to use another model, then you will need to adjust the code inside the varaid repository.
Inside the codes
directory, run the following (Edit the file to ensure the file paths are correct):
test.py
Training documentation not available.
The left images are the upscaled LR images which have been set to half the frame rate. The right images are the HR images produced by our final model.
The left images are the upscaled LR images. The right images are the HR images produced by our final model. We apply both sets of images to the YOLOv5 object detector to detect the ball objects.
The left images are the upscaled LR images. The right images are the HR images produced by our final model. We apply both sets of images to the HRNET pose estimation network.
This code is built on Zooming Slow-Mo. We also utilise Semantic Segmentation on MIT ADE20K to aid the training phase. We thank the authors for sharing their codes.