Video-Moment-Retrieval-Papers

Summary of the papers related to video moment retrieval / video grounding / video moment localization ...

2017

The beginning of this line of research

TALL / CTRL

MCN

2018

2019

2020

2021

Dataset

New

Benchmark Results % (2021)
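
The tables in this section report "R@n IoU@m": the percentage of queries for which at least one of the top-n predicted moments has a temporal IoU of at least m with the ground-truth moment. The following is a minimal sketch of that metric, not code from any of the listed papers. The function names, the (start, end) span representation in seconds, and the use of `>=` rather than `>` at the threshold are assumptions made for illustration; individual papers may differ in these details.

```python
from typing import Sequence, Tuple

Span = Tuple[float, float]  # hypothetical representation: (start, end) in seconds


def temporal_iou(pred: Span, gt: Span) -> float:
    """Intersection over union of two 1-D time spans."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_n(
    ranked_preds: Sequence[Sequence[Span]],
    gts: Sequence[Span],
    n: int,
    iou_threshold: float,
) -> float:
    """Percentage of queries whose top-n predictions contain a span
    with IoU >= iou_threshold (assumed convention) against the ground truth."""
    hits = sum(
        any(temporal_iou(p, gt) >= iou_threshold for p in preds[:n])
        for preds, gt in zip(ranked_preds, gts)
    )
    return 100.0 * hits / len(gts)


# Two queries, each with two ranked predictions (made-up numbers).
ranked = [[(0.0, 10.0), (5.0, 15.0)], [(20.0, 30.0), (40.0, 50.0)]]
gts = [(4.0, 14.0), (42.0, 50.0)]

print(recall_at_n(ranked, gts, n=1, iou_threshold=0.3))  # 50.0
print(recall_at_n(ranked, gts, n=1, iou_threshold=0.5))  # 0.0
print(recall_at_n(ranked, gts, n=5, iou_threshold=0.5))  # 100.0
```

In this toy example the top-1 prediction of the first query only reaches an IoU of about 0.43, so it counts at IoU@0.3 but not at IoU@0.5, which is why R@1 drops as the threshold rises while R@5 recovers both queries.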

ActivityNet Captions

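| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 | Note |
|---|---|---|---|---|---|---|---|---|---|
| IVG | - | 63.22 | 43.84 | 27.10 | - | - | - | - | |
| TCN+DCM | - | - | 44.9 | 27.7 | - | - | - | - | |
| CI-MHA | - | 61.49 | 43.97 | 25.13 | - | - | - | - | |
| FVMR | - | 60.63 | 45.00 | 26.85 | - | 86.11 | 77.42 | 61.04 | |
| BPNet | - | 58.98 | 42.07 | 24.69 | - | - | - | - | |
| CABL | - | 66.34 | 48.12 | 27.60 | - | 88.91 | 79.32 | 63.41 | |
| LOCFORMER | - | 60.61 | 43.74 | 27.04 | - | - | - | - | |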

Charades-STA

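| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 | Note |
|---|---|---|---|---|---|---|---|---|---|
| MMRG | 88.27 | 71.60 | 44.25 | - | 92.35 | 87.67 | 60.22 | - | |
| LoGAN | - | 51.67 | 34.68 | 14.54 | - | 92.74 | 74.30 | 39.11 | weakly-supervised |
| IVG | - | 67.63 | 50.24 | 32.88 | - | - | - | - | |
| TCN+DCM | - | - | 59.7 | 34.4 | - | - | - | - | I3D |
| CI-MHA | - | 69.87 | 54.68 | 35.27 | - | - | - | - | |
| FVMR | - | - | 55.01 | 33.74 | - | - | 89.17 | 57.24 | I3D |
| I2N-I3D | - | - | 52.28 | 31.32 | - | - | 80.65 | 54.17 | |
| BPNet | - | 65.48 | 50.75 | 31.64 | - | - | - | - | |
| LOCFORMER | - | 71.88 | 58.52 | 38.51 | - | - | - | - | |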

DiDeMo

| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 |
|---|---|---|---|---|---|---|---|---|
| TCN+DCM | - | - | - | 37.5 | - | - | - | - |

TACoS

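| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 | Note |
|---|---|---|---|---|---|---|---|---|---|
| MMRG | 85.34 | 57.83 | 39.28 | - | 84.37 | 78.38 | 56.34 | - | \* the figures look strangely high |
| IVG | 49.36 | 38.84 | 29.07 | 19.05 | - | - | - | - | |
| FVMR | 53.12 | 41.48 | 29.12 | - | 78.12 | 64.53 | 50.00 | - | |
| I2N-I3D | - | 31.47 | 29.25 | - | - | 52.65 | 46.08 | - | |
| BPNet | - | 25.96 | 20.96 | 14.08 | - | - | - | - | \* the figures look strangely low |
| CABL | 49.16 | 38.98 | 27.65 | - | 73.12 | 59.96 | 46.24 | - | |

  • For the rows marked with \*, maybe the experimental setting is different.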

Weakly-supervised methods

Comparison of some weakly-supervised methods on Charades-STA

| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 | Note |
|---|---|---|---|---|---|---|---|---|---|
| LoGAN | - | 51.67 | 34.68 | 14.54 | - | 92.74 | 74.30 | 39.11 | weakly-supervised |
| TGA | - | 29.68 | 17.04 | 6.93 | - | 83.87 | 58.17 | 26.80 | weakly-supervised |
| SCN | - | 42.96 | 23.58 | 9.97 | - | 95.56 | 71.80 | 28.87 | weakly-supervised |

LoGAN: Reuben Tan, et al. LoGAN: Latent graph co-attention network for weakly-supervised video moment retrieval. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2021.

TGA: Niluthpol Chowdhury Mithun, Sujoy Paul, and Amit K. Roy-Chowdhury. Weakly supervised video moment retrieval from text queries. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 11592–11601, 2019.

SCN: Zhijie Lin, Zhou Zhao, Zhu Zhang, Qi Wang, and Huasheng Liu. Weakly-supervised video moment retrieval via semantic completion network. In Proceedings of the AAAI Conference on Artificial Intelligence, 2020.

Benchmark Results % (2019 and before)

ActivityNet Captions

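| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 | Note |
|---|---|---|---|---|---|---|---|---|---|
| MCN | 42.80 | 21.37 | 9.58 | - | - | - | - | - | |
| CTRL | 49.09 | 28.70 | 14.0 | - | - | - | - | - | |
| ACRN | 50.37 | 31.29 | 16.17 | - | - | - | - | - | |
| QSPN | - | 45.3 | 27.7 | 13.6 | - | 75.7 | 59.2 | 38.3 | |
| TGN | 70.06 | 45.51 | 28.47 | - | 79.10 | 57.32 | 44.20 | - | |
| SCDM | - | 54.80 | 36.75 | 19.86 | - | 77.29 | 64.99 | 41.53 | |
| CBP | - | 54.30 | 35.76 | 17.80 | - | 77.63 | 65.89 | 46.20 | |
| TripNet | - | 48.42 | 32.19 | 13.93 | - | - | - | - | RL |
| ABLR | 73.30 | 55.67 | 36.79 | - | - | - | - | - | RL |
| ExCL | - | 63.30 | 43.6 | 24.1 | - | - | - | - | |
| PFGA | 75.25 | 51.28 | 33.04 | 19.26 | - | - | - | - | |
| WSDEC-X (Weakly) | 62.7 | 42.0 | 23.3 | - | - | - | - | - | |
| WSLLN (Weakly) | 75.4 | 42.8 | 22.7 | - | - | - | - | - | |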

Charades-STA

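| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 | Note |
|---|---|---|---|---|---|---|---|---|---|
| CTRL | - | - | 23.63 | 8.89 | - | - | 58.92 | 29.52 | |
| ABLR | - | - | 24.36 | 9.01 | - | - | - | - | |
| SMRL | - | - | 24.36 | 11.17 | - | - | 61.25 | 32.08 | |
| ACL-K | - | - | 30.48 | 12.20 | - | - | 64.84 | 35.13 | |
| SAP | - | - | 27.42 | 13.36 | - | - | 66.37 | 38.15 | |
| QSPN | - | 54.7 | 35.6 | 15.8 | - | 95.8 | 79.4 | 45.4 | |
| MAN | - | - | 46.53 | 22.72 | - | - | 86.23 | 53.72 | |
| SCDM | - | - | 54.44 | 33.43 | - | - | 74.43 | 58.08 | |
| CBP | - | - | 36.80 | 18.87 | - | - | 70.94 | 50.19 | |
| TripNet | - | 51.33 | 36.61 | 14.50 | - | - | - | - | RL |
| ExCL | - | 65.1 | 44.1 | 23.3 | - | - | - | - | RL |
| PFGA | - | 67.53 | 52.02 | 33.74 | - | - | - | - | |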

DiDeMo

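| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 |
|---|---|---|---|---|---|---|---|---|
| TMN | 22.92 | - | - | - | 76.08 | - | - | - |
| MCN | 28.10 | - | - | - | 78.21 | - | - | - |
| TGN | 28.23 | - | - | - | 79.26 | - | - | - |
| MAN | 27.02 | - | - | - | 81.70 | - | - | - |
| WSLLN (Weakly) | 19.4 | - | - | - | 54.4 | - | - | - |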

TACoS

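| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 | Note |
|---|---|---|---|---|---|---|---|---|---|
| MCN | 2.62 | 1.64 | 1.25 | - | 2.88 | 1.82 | 1.01 | - | |
| CTRL | 24.32 | 18.32 | 13.30 | - | 48.73 | 36.69 | 25.42 | - | |
| TGN | 41.87 | 21.77 | 18.90 | - | 53.40 | 39.06 | 31.02 | - | |
| ACRN | 24.22 | 19.52 | 14.62 | - | 47.42 | 34.97 | 24.88 | - | |
| ACL-K | 31.64 | 24.17 | 20.01 | - | 57.85 | 42.15 | 30.66 | - | |
| SCDM | - | 26.11 | 21.17 | - | - | 40.16 | 32.18 | - | |
| CBP | - | 27.31 | 24.79 | 19.10 | - | 43.64 | 37.40 | 25.59 | |
| TripNet | - | 23.95 | 19.17 | 9.52 | - | - | - | - | RL |
| SMRL | 26.51 | 20.25 | 15.95 | - | 50.01 | 38.47 | 27.84 | - | RL |
| ABLR | 34.7 | 19.5 | 9.4 | - | - | - | - | - | RL |
| ExCL | - | 45.5 | 28.0 | 14.6 | - | - | - | - | |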