This code is a PyTorch implementation of RMN, forked from tgc1997; the Hindi MSR-VTT dataset was created by alokssingh. The original code has been modified for compatibility with Hindi text. This implementation of RMN is used as a baseline model.
- Python 3.7.3 (other versions may also work)
- PyTorch 1.4.0 (other versions may also work)
- pickle
- tqdm
- h5py
- matplotlib
- numpy
- tensorboard_logger
- CUDA 10.1
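Before training, it can help to confirm that the packages listed above are importable in the active environment. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and the module names are the usual import names for the listed packages (PyTorch is imported as `torch`). It does not check versions or the CUDA installation.

```python
import importlib.util

# Import names for the packages listed above (PyTorch is imported as "torch").
REQUIRED_MODULES = [
    "torch", "pickle", "tqdm", "h5py",
    "matplotlib", "numpy", "tensorboard_logger",
]


def find_missing(modules):
    """Return the module names that cannot be located in the current environment."""
    # find_spec looks a module up without importing it and returns None if absent.
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```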
- Download visual features from MSR-VTT and text features from MSR-VTT-Hindi-text and put them in the data folder.
- Download the evaluation tool from caption-eval.
python train.py --dataset=msr-vtt --model=RMN --result_dir=results/msr-vtt_model --use_lin_loss \
--learning_rate_decay --learning_rate_decay_every=5 --learning_rate_decay_rate=3 \
--use_loc --use_rel --use_func --use_multi_gpu --learning_rate=1e-4 --attention=gumbel \
--hidden_size=1300 --att_size=1024 --train_batch_size=32 --test_batch_size=8
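To give a feel for the learning-rate flags in the training command, here is a rough sketch of a step-decay schedule. It assumes that `--learning_rate_decay_every=5` and `--learning_rate_decay_rate=3` mean "divide the learning rate by 3 every 5 epochs"; this is an interpretation of the flag names, and the exact logic in train.py may differ. The function name `decayed_learning_rate` is hypothetical.

```python
def decayed_learning_rate(base_lr, epoch, decay_every, decay_rate):
    """Step decay: divide base_lr by decay_rate once per completed block of decay_every epochs."""
    # Assumption: this mirrors the apparent intent of the flags,
    # not necessarily the exact update rule used in train.py.
    num_decays = epoch // decay_every
    return base_lr / (decay_rate ** num_decays)


# Values taken from the training command: learning_rate=1e-4, decay every 5 epochs, rate 3.
for epoch in (0, 4, 5, 10):
    print(epoch, decayed_learning_rate(1e-4, epoch, decay_every=5, decay_rate=3))
```

Under this assumption the rate stays at 1e-4 for epochs 0 to 4, drops to about 3.3e-5 at epoch 5, and to about 1.1e-5 at epoch 10.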
python evaluate.py --dataset=msr-vtt --model=RMN --result_dir=results/msr-vtt_model \
--use_loc --use_rel --use_func --hidden_size=1300 --att_size=1024 \
--test_batch_size=2 --beam_size=2 --eval_metric=CIDEr
NOTE: For the METEOR score, we used meteor_indic, with indic_tokenizer for tokenization.