This is the PyTorch implementation of:
- Chenyou Fan, Xiaofan Zhang, Shu Zhang, Wensheng Wang, Chi Zhang, Heng Huang. Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering. In CVPR, 2019. [link]
@inproceedings{fan-CVPR-2019,
  author = {Chenyou Fan and Xiaofan Zhang and Shu Zhang and Wensheng Wang and Chi Zhang and Heng Huang},
  title = "{Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering}",
  booktitle = {CVPR},
  year = 2019
}
TGIF-QA, see gif-qa/
MSVD-QA, see msvd-qa/
Youtube2text, see zh-qa/
Python = 2.7
PyTorch = 1.0 [here]
GPU: training needs 4 GB+ of GPU memory; testing needs 1 GB+.
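
The requirements above can be checked before running anything. The script below is a minimal sketch and not part of this repository: `enough_gpu_memory` and `check_environment` are hypothetical helper names, and reading "4G+" and "1G+" as 4 GiB and 1 GiB is an assumption. It only reports what it finds and does not stop you from running with other versions.

```python
import sys

GIB = 1024 ** 3  # assumption: "4G+" / "1G+" are read as GiB


def enough_gpu_memory(total_bytes, mode="train"):
    """Return True if total_bytes meets the minimum listed for the given mode."""
    required_gib = {"train": 4, "test": 1}[mode]
    return total_bytes >= required_gib * GIB


def check_environment():
    """Print the detected versions and GPU memory next to the listed ones."""
    print("Python %d.%d (listed: 2.7)" % sys.version_info[:2])
    try:
        import torch
    except ImportError:
        print("PyTorch not installed (listed: 1.0)")
        return
    print("PyTorch %s (listed: 1.0)" % torch.__version__)
    if not torch.cuda.is_available():
        print("No CUDA device visible")
        return
    # Total memory of the first visible GPU, in bytes.
    total = torch.cuda.get_device_properties(0).total_memory
    for mode in ("train", "test"):
        status = "ok" if enough_gpu_memory(total, mode) else "too little GPU memory"
        print("%s: %s" % (mode, status))


if __name__ == "__main__":
    check_environment()
```

Save it under any name and run it with the same interpreter you plan to train with; it uses only single-argument `print` calls, so it should behave the same on Python 2.7 and Python 3.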