This repo contains the code for "Discovering Spatio-Temporal Rationales for Video Question Answering".
The environment uses Anaconda 4.10.3, Python 3.7.13, PyTorch 1.11.0, and CUDA 11.3. For other libraries, please refer to requirements.txt.
Please create an environment for this project using Anaconda (install Anaconda first if you have not already):
>conda create -n videoqa python==3.7.13
>conda activate videoqa
>pip install -r requirements.txt
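After installation, a quick sanity check of the environment can save debugging time later. The following is a minimal sketch, not part of the repo: the function name `check_environment` is hypothetical, and it only compares the interpreter against the Python version listed above and reports whether PyTorch can be found.

```python
import sys
import importlib.util

# Python version listed in this README; adjust if your setup differs.
EXPECTED_PYTHON = (3, 7)


def check_environment():
    """Return a dict describing the active Python/PyTorch setup."""
    report = {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "python_matches": sys.version_info[:2] == EXPECTED_PYTHON,
        # find_spec returns None when the package is not installed.
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if report["torch_installed"]:
        import torch

        report["torch"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

Run it inside the activated `videoqa` environment; if `torch_installed` is False, the `pip install` step likely did not complete.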
Please download the QA annotations from NExT and Causal-Vid.
After preparing the features, please put the data under the folder ['video_feature']
accordingly. You can also modify the path in ['Dataloader.py'] to load the features from a different location.
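Before training, it may help to confirm that the features are where the loader expects them. The sketch below is an illustration only: it assumes the features sit under ['video_feature'] relative to the current working directory, and `list_feature_files` is a hypothetical helper, not a function from ['Dataloader.py']. No file names or formats are assumed.

```python
import os

# Assumption: features live under 'video_feature' relative to the working
# directory; the path actually used is whatever is set in Dataloader.py.
FEATURE_ROOT = "video_feature"


def list_feature_files(root=FEATURE_ROOT):
    """Return sorted relative paths of all files under root ([] if root is missing)."""
    if not os.path.isdir(root):
        return []
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(found)


if __name__ == "__main__":
    files = list_feature_files()
    if files:
        print(f"Found {len(files)} file(s) under '{FEATURE_ROOT}', e.g. {files[0]}")
    else:
        print(f"No files under '{FEATURE_ROOT}'; check the folder and the path in Dataloader.py")
```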
Once the data is ready, you can easily run the code. There are four folders whose names represent datasets. Enter the folder for the dataset you want to use. After entering a specific folder:
To train the model, please run:
python train.py -v=train -m=train
This trains the model and saves it to ['models'].
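To locate what a training run produced, one option is to look up the most recently modified file in ['models']. This is a rough sketch under stated assumptions: `newest_file` is a hypothetical helper, and it assumes saved models are written as files directly inside the folder. The naming scheme is not specified here, so no file extension is assumed.

```python
import os


# Assumption: saved models are files directly inside 'models';
# no particular file name or extension is assumed.
def newest_file(folder="models"):
    """Return the path of the most recently modified file in folder, or None."""
    if not os.path.isdir(folder):
        return None
    paths = [os.path.join(folder, name) for name in os.listdir(folder)]
    files = [p for p in paths if os.path.isfile(p)]
    return max(files, key=os.path.getmtime) if files else None


if __name__ == "__main__":
    print(newest_file() or "No saved files found in 'models'")
```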