Adversarial Video Moment Retrieval by Jointly Modeling Ranking and Localization (AVMR)

This is our implementation for the paper:

Da Cao, Yawen Zeng, Xiaochi Wei, Liqiang Nie, Richang Hong, Zheng Qin. 2020. Adversarial Video Moment Retrieval by Jointly Modeling Ranking and Localization. In The ACM International Conference on Multimedia (ACM MM '20). ACM, Seattle, United States.

Environment Settings

We use the framework pytorch.

  • pytorch version: '1.2.0'
  • python version: '3.5'

AVMR

The released code consists of the following files.

--workplaceTest
--workplaceTrain
--feature_all
--main.py
--MADDPG.py
--model.py
--memory.py
--utils.py
--randomProcess.py
--readme

Example to run the codes

Run AVMR:

python main.py

Example demo

AVMR_demo_YVKIV.mp4

The visualization of five generated video moments on five iterations is revealed above. The green line and green box indicate the ground truth, the black line and black box represent the generated video moment candidates.

Dataset

We provide two processed datasets: Charades-STA && TACoS The strategy of multi-scale sliding windows is utilized to segment each video with the size of [64, 128, 256, 512] frames with 80% overlap and we randomly selected 80% and 20% of them for training and testing, respectively.

All features are saved in ./feature_all_train, ./feature_all_test.

Workspace

/workplaceTest and /workplaceTrain that processes videos in advance as a runtime workspace.

Last Update Date: Jul 28, 2020