Official code implementation of the paper: Video and Text Matching with Conditioned Embeddings
https://arxiv.org/abs/2110.11298

Datasets:

We employ the following datasets in our work:

  1. ActivityNet Captions: the pre-extracted features can be downloaded by clicking here.
  2. DiDeMo: the pre-extracted features can be downloaded by clicking here.
  3. VATEX: click here.
  4. MSR-VTT: can be downloaded by clicking here.
  5. YouCook2: the pre-extracted features can be downloaded here.
  6. LSMDC: click here.

Training:

Example training command on ActivityNet:
python train.py anet_precomp --feat_name i3d --img_dim 2048 --norm
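
If you want to launch training from a script rather than the shell, the command above can be assembled programmatically. The following is only a minimal sketch: the helper name build_train_command is hypothetical and not part of this repository, and it uses only the dataset name and flags shown in the example command. Whether train.py accepts other values for these arguments is not covered here.

```python
import shlex


def build_train_command(dataset="anet_precomp", feat_name="i3d",
                        img_dim=2048, norm=True):
    """Build the argument list for the example training command.

    Hypothetical helper: the defaults mirror the example command in this
    README; no other dataset names or flags are assumed.
    """
    cmd = ["python", "train.py", dataset,
           "--feat_name", feat_name,
           "--img_dim", str(img_dim)]
    if norm:
        # --norm is a bare switch in the example command, so it takes no value
        cmd.append("--norm")
    return cmd


if __name__ == "__main__":
    # Print the command as a single shell-quoted string
    print(shlex.join(build_train_command()))
```

To actually start training, pass the returned list to subprocess.run(cmd, check=True) from the repository root.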